8000 Linker Improvements by rampantmonkey · Pull Request #2 · cooperative-computing-lab/cctools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Linker Improvements #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 19, 2013
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions doc/makeflow.html
Original file line number Diff line number Diff line change
Expand Up @@ -442,6 +442,26 @@ <h2>Makeflow Log File Format</h2>
<li><b>dag_gc_collected</b> - the total number of files has been collected so far since the start this Makeflow execution.</li>
</ul>

<h2>Linking Workflow Dependencies</h2>
<tt>Makeflow</tt> provides a tool to collect all of the dependencies for a given workflow into one directory. By collecting all of the input files and programs contained in a workflow it is possible to run the workflow on other machines.

Currently, <tt>Makeflow</tt> copies all of the files specified as dependencies by the rules in the makeflow file, including scripts and data files. Some of the files not collected are dynamically linked libraries, executables not listed as dependencies (<tt>python</tt>, <tt>perl</tt>), and configuration files (<tt>mail.rc</tt>).

To avoid naming conflicts, files which would otherwise have an identical path are renamed when copied into the bundle
<ul>
<li>All file paths are relative to the top level directory.</li>
<li>The makeflow file is rewritten with the new file locations and placed in the top level directory.</li>
<li>Files originally specified with an absolute path are placed into the top level directory.</li>
<li>Files with the same path are appended with a unique number</li>
</ul>

Example usage:
<pre>
% makeflow -b some_output_directory example.makeflow
</pre>



<h2>For More Information</h2>

For the latest information about Makeflow, please visit our <a href=http://www.cse.nd.edu/~ccl/software/makeflow>web site</a> and subscribe to our <a href=http://www.cse.nd.edu/~ccl/software>mailing list</a>.
Expand Down
6 changes: 6 additions & 0 deletions doc/man/makeflow.m4
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,12 @@ LONGCODE_BEGIN
makeflow -T wq -a -N project.name Makeflow
LONGCODE_END

Create a directory containing all of the dependencies required to run the
specified makeflow
LONGCODE_BEGIN
makeflow -b bundle Makeflow
LONGCODE_END

SECTION(COPYRIGHT)

COPYRIGHT_BOILERPLATE
Expand Down
62 changes: 55 additions & 7 deletions makeflow/src/makeflow.c
10000
Original file line number Diff line number Diff line change
Expand Up @@ -306,16 +306,64 @@ void collect_input_files(struct dag *d, char* bundle_dir, char *(*rename)(struct
list_delete(il);
}

char* bundler_rename(struct dag_node *d, const char *filename)
/* When a collision is detected (a file with an absolute path has the same base name as a relative file) a
* counter is appended to the filename and the name translation is retried */
char* bundler_translate_name(const char *filename, int collision_counter)
{
if (filename[0] == '/'){
const char *new_filename;
new_filename = (char *)malloc(PATH_MAX * sizeof (*new_filename));
new_filename = string_basename(filename);
static struct hash_table *previous_names = NULL;
static struct hash_table *reverse_names = NULL;
if ( !previous_names )
previous_names = hash_table_create(0, NULL);
if ( !reverse_names )
reverse_names = hash_table_create(0, NULL);

char fn[PATH_MAX];
if (collision_counter){
sprintf(fn, "%s%d",filename,collision_counter);
}else
sprintf(fn, "%s", filename);

const char *new_filename = malloc(PATH_MAX * sizeof (char));
new_filename = hash_table_lookup(previous_names, fn);
if ( new_filename )
return xxstrdup(new_filename);

new_filename = hash_table_lookup(reverse_names, fn);
if ( new_filename ){
collision_counter++;
return bundler_translate_name(fn, collision_counter);
}
else
return xxstrdup(filename);
if (filename[0] == '/'){
new_filename = string_basename(fn);
if(hash_table_lookup(previous_names, new_filename)){
collision_counter++;
return bundler_translate_name(fn, collision_counter);
}
else if(hash_table_lookup(reverse_names, new_filename)){
collision_counter++;
return bundler_translate_name(fn, collision_counter);
}
else {
hash_table_insert(reverse_names, new_filename, fn);
hash_table_insert(previous_names, fn, new_filename);
return xxstrdup(new_filename);
}
}
else {
hash_table_insert(previous_names, fn, fn);
hash_table_insert(reverse_names, fn, fn);
return xxstrdup(fn);
}
}

char* bundler_rename(struct dag_node *n, const char *filename){

if (n) {
struct list *input_files = dag_input_files(n->d);
if(list_find(input_files, (int (*) (void *, const void *)) string_equal, (void*)filename))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing one const in the function cast.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a second const results in a warning.

makeflow.c:359: warning: passing argument 2 of ‘cctools_list_find’ from incompatible pointer type

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check list.h, and list.c...

return xxstrdup(filename);
}
return bundler_translate_name(filename, 0); /* no collisions yet -> 0 */
}

void dag_show_output_files(struct dag *d)
Expand Down
4 changes: 3 additions & 1 deletion makeflow/src/visitors.c
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ See the file COPYING for details.
#include "hash_table.h"
#include "xxmalloc.h"
#include "list.h"
#include "stringtools.h"

#include "visitors.h"

Expand All @@ -30,7 +31,8 @@ int dag_to_file_vars(const struct dag *d, FILE *dag_stream)

hash_table_firstkey(vars);
while(hash_table_nextkey(vars, &var, &value)) {
fprintf(dag_stream, "%s=\"%s\"\n", var, (char *) value);
if(!string_null_or_empty(value) && strcmp(var, "_MAKEFLOW_COLLECT_LIST"))
fprintf(dag_stream, "%s=\"%s\"\n", var, (char *) value);
}

return 0;
Expand Down
39 changes: 39 additions & 0 deletions makeflow/test/TR_makeflow_linker_collision.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
#!/bin/sh

. ../../dttools/src/test_runner.common.sh

out_dir="linker_collision_out"

prepare() {
if [ -d "$out_dir" ]; then
exit 1
fi

echo "t" > /tmp/asdf
echo "a" > linker/asdf

cd ../src/; make
exit $?
}

run() {
cd linker
../../src/makeflow -b "$out_dir" collision.mf

files=`ls "$out_dir" | wc -l`
if [ $files != "3" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there are only three files, please check that each of them was generated with the correct name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about checking each file, but that would become a very brittle test. If the translate function changes in any way the file names will change, but there will always be three.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What worries me is that we do care about the names, since they have to match the names in the makeflow file. What if we copy with a name but write another to the makeflow file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what we should be checking is that the names in the makeflow file correspond to the files in the bundle directory. I will work on this tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking that the names in the makeflow file correspond to the copied files is unnecessary. Both are renamed with the same function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same function, yes, but a function with internal state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function does not have an internal state. It is external, otherwise it wouldn't be shared between calls. The state also ensures that every reference to a file is mapped identically.

Testing this function with our current framework would require re-parsing both the old and new makeflow files, computing the differences, and comparing the differences with the file system.

This is a prime candidate for unit testing.

exit 1
fi

exit 0
}

clean() {
cd linker
rm -r "$out_dir"
rm /tmp/asdf
rm asdf
exit 0
}

dispatch $@
66 changes: 66 additions & 0 deletions makeflow/test/TR_makeflow_linker_complex.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
#!/bin/sh

. ../../dttools/src/test_runner.common.sh

out_dir="linker_complex_out"

prepare() {
if [ -d "$out_dir" ]; then
exit 1
fi

touch /tmp/makeflow_test_complex_path
mkdir -p /tmp/a/b/x
touch /tmp/a/b/x/y
chmod 777 /tmp/a/b/x/y
cd ../src/; make
exit $?
}

run() {
cd linker
../../src/makeflow -b "$out_dir" complex.mf

if [ ! -f "$out_dir"/makeflow_test_complex_path ]; then
exit 1
fi

filename="$out_dir"/y
if [ ! -f $filename ]; then
exit 1
fi

case `uname -s` in
Darwin)
cmd=`stat $filename | awk '{print $3}'`
;;
*)
cmd=`stat -c %A $filename`
;;
esac

if [ ! "-rwxrwxrwx" == "$cmd" ]; then
exit 1
fi

first_line=`head -n 1 $out_dir/complex.mf`
if [ ! $? == 0 ]; then
exit 1
fi
if [ ! "VARIABLE=\"testing\"" == "$first_line" ]; then
exit 1
fi

exit 0
}

clean() {
cd linker
rm -r "$out_dir"
rm -r /tmp/a
rm -r /tmp/makeflow_test_complex_path
exit 0
}

dispatch $@

2 changes: 2 additions & 0 deletions makeflow/test/linker/collision.mf
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
: asdf /tmp/asdf
echo ""
7 changes: 7 additions & 0 deletions makeflow/test/linker/complex.mf
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
VARIABLE = "testing"

: out.put
echo ""

out.put : /tmp/makeflow_test_complex_path /tmp/a/b/x/y
touch out.put
0