ART is an ancestral reconstruction tool . . .

Command-line Syntax

To run ART, change to the $ART directory and use the following syntax:

art (command) (arguments)

Where valid commands are:

calculate
findanc
showtrees
clean
help

ART accepts one command at a time. However, each command can be followed by any number of arguments. The arguments can be listed in any order. Read further for a description of each command and argument.
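
The calling convention above can be sketched as a small Python helper that assembles the argument list for one ART invocation. The helper name build_art_command and the idea of driving ART from Python are illustrative assumptions, not part of ART; only the five command names come from this page.

```python
# Hypothetical helper (not part of ART): builds the argument list
# for a single ART invocation of the form "art (command) (arguments)".
VALID_COMMANDS = ("calculate", "findanc", "showtrees", "clean", "help")


def build_art_command(command, *arguments):
    """Return ['art', command, ...arguments] after checking the command name."""
    if command not in VALID_COMMANDS:
        raise ValueError(f"unknown ART command: {command!r}")
    # ART takes exactly one command; everything after it is an argument.
    return ["art", command, *[str(arg) for arg in arguments]]


if __name__ == "__main__":
    print(build_art_command("showtrees", "-p", "worms"))
```

A list built this way could be passed to Python's subprocess.run with cwd set to the $ART directory, assuming the art executable can be found from there.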


calculate

'calculate' runs codeml. After codeml finishes, ART stores the generated data in the ART database.

Optional arguments:

-p (project name)

The -p argument specifies the current working project. If this argument is not used, then by default ART picks the most recently used project.

-d (database name)

The -d argument specifies the current working database. If this argument is not used, then by default ART picks the database it used last time. In most cases, this argument does not need to be invoked.

-treefile (input tree filepath)

The -treefile argument specifies the path to a file containing phylogenetic trees. If this argument is not used, ART will look for a file named '$ART/(project name)/input_trees.txt', where (project name) is the name of the current working project. Please consult the CodeML documentation for more information about the tree file.

-seqfile (input sequence file)

The -seqfile argument specifies the path to a file containing descendant sequences. If this argument is not used, ART will look for a file named '$ART/(project name)/input_sequences.txt', where (project name) is the name of the current working project. Please consult the CodeML documentation for more information about the sequence file.

-modelfile (input model file)

The -modelfile argument specifies the path to a file that contains the name of an evolutionary model. If this argument is not used, ART will look for a file named '$ART/(project name)/input_model.txt', where (project name) is the name of the current working project. The file should contain a single line, for example:

wag.dat

The evolutionary model should be located in the following directory:

/common/share/paml/dat/

For our example, this means that /common/share/paml/dat/wag.dat should exist. In future versions of ART, this restriction will be removed.
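
As a rough illustration of the model-file convention above, the following Python sketch writes a one-line model file and reports whether the named model exists in the PAML data directory. The function name write_model_file is hypothetical, and the trailing newline after the model name is an assumption; this page does not say whether ART requires one.

```python
from pathlib import Path

# Directory named in the text above; kept as a parameter default so it
# can be overridden on systems where PAML lives elsewhere.
PAML_DAT_DIR = Path("/common/share/paml/dat")


def write_model_file(model_file, model_name, dat_dir=PAML_DAT_DIR):
    """Write a single-line model file; return True if the model is in dat_dir."""
    model_file = Path(model_file)
    # The file holds only the model's file name, e.g. "wag.dat".
    model_file.write_text(model_name + "\n")
    return (Path(dat_dir) / model_name).is_file()
```

A False return value means that the model named in the file was not found in the data directory.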

Examples:

Example 1:

art calculate -p worms -treefile ./myfiles/worms-trees.txt -seqfile ./myfiles/worm-sequences.txt \
           -modelfile ./myfiles/my-favorite-modelname.txt

Description: Creates a project called 'worms' and runs codeml using the specified tree file, sequence file, and model file. By default, the generated data will be stored in the current working database.

Example 2:

art calculate -p reptiles -d ancrecon

Description: Creates a project called 'reptiles' and runs codeml using the default input files located at '$ART/reptiles/input_trees.txt', '$ART/reptiles/input_sequences.txt', and '$ART/reptiles/input_model.txt'. The generated data will be stored in the ancrecon database.
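
Before running 'calculate' without file arguments, it may help to confirm that the default input files are in place. The Python sketch below derives the three default paths described above; the function names are hypothetical, and art_dir stands for whatever the $ART directory is on your system.

```python
from pathlib import Path


def default_input_files(art_dir, project):
    """Map each file argument to the default path ART is documented to use."""
    base = Path(art_dir) / project
    return {
        "-treefile": base / "input_trees.txt",
        "-seqfile": base / "input_sequences.txt",
        "-modelfile": base / "input_model.txt",
    }


def missing_inputs(art_dir, project):
    """List the file arguments whose default file does not exist yet."""
    defaults = default_input_files(art_dir, project)
    return [flag for flag, path in defaults.items() if not path.is_file()]
```

Any argument returned by missing_inputs would need to be given explicitly on the command line, or its default file created first.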


findanc

'findanc' finds the ancestor(s) for the user-specified descendant list.

IMPORTANT: findanc produces meaningless results if the MySQL database does not contain any data. To fill the database, run the 'calculate' command first.

Output

findanc produces several output files, which are written to the project directory. (You can specify the current working project with the -p argument.) The findanc output files are:

sequence_table.txt
Description: The table of ancestral states for each reconstructed ancestor. This file can be renamed with the '-o' argument.

map_ancestral_sequence.txt (when the '-map' argument is used)
Description: The maximum a posteriori ancestral sequence, with a table showing the probability for each state on each site.

reconstructed_sequences.txt
Description: A table of tree IDs and the corresponding reconstructed ancestral sequences.

tree_node_pairs.txt
Description: A table of tree IDs and the corresponding taxon ID of the reconstructed ancestor.

Required arguments:

(descendant list)

The descendant list must contain two or more descendant taxa. Descendant taxa can be given as IDs or names. The list can be space-separated, or it can be a range of IDs.

For example,

art findanc 45 35 -p (project name)

will find the ancestor for taxa #45 and #35.

art findanc 13-35 -p (project name) -outgroup 67 crocodilePR

will find the ancestor for the range of taxa #13 through #35, using a tree rooted with outgroup taxa 67 and crocodilePR.

art findanc oreoPR gatorPR jayPR -p (project name)

will find the ancestor for the taxa named oreoPR, gatorPR, and jayPR.
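
The three forms of descendant list shown above can be illustrated with a short Python sketch that expands a list into individual taxa. This is not ART's own parser. The function name is hypothetical, and treating a range such as 13-35 as inclusive of both endpoints is an assumption based on the wording "13 through 35".

```python
import re

# Matches an ID range such as "13-35".
_RANGE = re.compile(r"^(\d+)-(\d+)$")


def expand_descendants(tokens):
    """Expand descendant tokens into a list of integer IDs and string names."""
    taxa = []
    for token in tokens:
        match = _RANGE.match(token)
        if match:
            first, last = int(match.group(1)), int(match.group(2))
            if first > last:
                raise ValueError(f"descending range: {token!r}")
            # Assumption: both endpoints belong to the range.
            taxa.extend(range(first, last + 1))
        elif token.isdigit():
            taxa.append(int(token))   # a single taxon ID
        else:
            taxa.append(token)        # a taxon name
    if len(taxa) < 2:
        raise ValueError("the descendant list must contain two or more taxa")
    return taxa
```

Under these assumptions, the tokens 45 and 35 yield two IDs, a range token yields every ID it spans, and names pass through unchanged.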

Optional arguments:

-outgroup (outgroup list)

The -outgroup argument specifies the desired rooting of the ancestral tree. The outgroup list can contain taxon names, taxon IDs, or a range of IDs.

-p (project name)

The -p argument specifies the current working project.

-d (database name)

The -d argument specifies the current working database. If this argument is not used, then by default ART picks the database it used last time. In most cases, this argument does not need to be invoked.

-t (tree specifier)

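If the -t argument is used, findanc will find ancestors only for the specified trees. You can specify trees as individual IDs, the most likely tree (ml), a confidence interval (ci:), a probability cutoff (gt:), or all trees (all). If the -t argument is not used, findanc finds ancestors for all trees.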

For example,

art findanc (descendant list) -p (project name) -t 1 4 2

will find ancestors for trees 1, 4, and 2.

art findanc (descendant list) -p (project name) -t ml

will find the ancestor only for the most likely tree.

art findanc (descendant list) -p (project name) -t ci:0.80

will find ancestors for the trees in an 80-percent confidence interval.

art findanc (descendant list) -p (project name) -t gt:0.30

will find ancestors for the trees with a probability greater than 30 percent.

art findanc (descendant list) -p (project name) -t all

will find the ancestors for all trees.
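
To make the tree specifier forms above concrete, here is a minimal Python sketch that classifies the tokens following -t. It is not ART's own parser; the function name and the (kind, value) return shape are hypothetical. Requiring ci: and gt: values to lie between 0 and 1 is an assumption drawn from the examples above.

```python
def parse_tree_specifier(tokens):
    """Classify the tokens after -t as ('ml'|'all'|'ci'|'gt'|'ids', value)."""
    if not tokens:
        raise ValueError("-t needs at least one value")
    first = tokens[0]
    if first in ("ml", "all"):
        return (first, None)
    if first.startswith(("ci:", "gt:")):
        kind, _, raw = first.partition(":")
        value = float(raw)
        # Assumption: cutoffs are written as fractions, e.g. 0.80 for 80 percent.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"expected a fraction between 0 and 1, got {raw!r}")
        return (kind, value)
    # Anything else is treated as a list of individual tree IDs.
    return ("ids", [int(token) for token in tokens])
```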

-s (state specifier)

Some ancestral sites have multiple possible states. The -s argument lets you specify which states appear in the output. You can show all states, only the most likely state, or the states within a confidence interval or above a probability cutoff.

For example,

art findanc (descendant list) -p (project name) -s ml

will force findanc to output only the most likely ancestral state.

art findanc (descendant list) -p (project name) -s ci:0.60

will force findanc to output ancestral states within a 60-percent confidence interval.

art findanc (descendant list) -p (project name) -s gt:0.40

will force findanc to output only those ancestral states with a probability greater than 40 percent.

art findanc (descendant list) -p (project name) -s all

will cause findanc to output all possible ancestral states.
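
The effect of the four state specifiers on a single site can be sketched in Python as follows. The function name and the input dictionary are hypothetical. This page does not define how ART builds its confidence interval, so the reading used here is an assumption: the most probable states are taken in order until their cumulative probability reaches the threshold.

```python
def select_states(probabilities, mode, value=None):
    """Filter one site's {state: probability} table by an -s style mode."""
    # Rank states from most to least probable.
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    if mode == "all":
        return ranked
    if mode == "ml":
        return ranked[:1]
    if mode == "gt":
        return [(state, p) for state, p in ranked if p > value]
    if mode == "ci":
        # Assumption: accumulate the most probable states until the
        # running total reaches the requested threshold.
        chosen, total = [], 0.0
        for state, p in ranked:
            if total >= value:
                break
            chosen.append((state, p))
            total += p
        return chosen
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    site = {"A": 0.55, "S": 0.30, "T": 0.15}
    for mode, value in (("ml", None), ("ci", 0.60), ("gt", 0.40), ("all", None)):
        print(mode, select_states(site, mode, value))
```

With the example site above, ml and gt:0.40 both keep only A, ci:0.60 keeps A and S, and all keeps every state.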

-map

The -map argument finds the maximum a posteriori (MAP) sequence of the ancestor, integrated over all trees. The MAP output is written to a file named map_ancestral_sequence.txt.

For example,

art findanc (descendant list) -p (project name) -outgroup (outgroup list) -map
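
One way to read "integrated over all trees" is as a weighted average: each tree's per-site state probabilities are weighted by that tree's probability, and the highest-scoring state is kept at each site. The Python sketch below illustrates that reading only; it is not taken from ART's source. The function name and data layout are hypothetical, and it assumes the ancestral sequences from all trees have the same length and are aligned site by site.

```python
def map_sequence(tree_probs, site_posteriors):
    """
    tree_probs:      {tree_id: probability of that tree}
    site_posteriors: {tree_id: [ {state: probability}, ... one dict per site ]}
    Returns the string of highest-scoring states, one per site.
    """
    n_sites = len(next(iter(site_posteriors.values())))
    sequence = []
    for site in range(n_sites):
        combined = {}
        for tree_id, weight in tree_probs.items():
            for state, prob in site_posteriors[tree_id][site].items():
                # Weight each tree's posterior by the probability of the tree.
                combined[state] = combined.get(state, 0.0) + weight * prob
        sequence.append(max(combined, key=combined.get))
    return "".join(sequence)


if __name__ == "__main__":
    trees = {1: 0.7, 2: 0.3}
    posteriors = {
        1: [{"A": 0.6, "S": 0.4}, {"L": 0.9, "I": 0.1}],
        2: [{"A": 0.1, "S": 0.9}, {"L": 0.8, "I": 0.2}],
    }
    print(map_sequence(trees, posteriors))
```

In this toy input, the most likely tree alone favors A at the first site, but the weighted average over both trees favors S, which shows why a sequence integrated over trees can differ from the sequence reconstructed on the single best tree.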

More examples:

Example 1:

art findanc 13-35 -p worms -s gt:0.30

Description: Find the ancestor for the range of taxa 13 through 35, using the data in the 'worms' project. By default, ancestors will be found for every tree. The output will contain any states with a probability greater than 30 percent.

Example 2:

art findanc gatorXYZ croc123 -t ml -s all -map -p reptiles

Description: Find the ancestor for the taxa 'gatorXYZ' and 'croc123', using the data in the 'reptiles' project. Only find the ancestor for the most likely tree, but output all possible states. Also, find the maximum a posteriori sequence.

Example 3:

art findanc 60 61 62 -p reptiles

Description: Find the ancestor for taxa 60, 61, and 62. Use the data in the 'reptiles' project. By default, find the ancestor(s) for all trees and output sequence information for all possible states.


showtrees

'showtrees' will produce a table with each tree ID, the tree (in parenthetical notation), and the probability of each tree.

Output:

showtrees produces one output file, which is written to the directory for the current working project.

tree_info.txt
Description: A table of tree IDs, the parenthetical tree, and the probability of that tree.

Optional arguments:

-p (project name)

The -p argument specifies the current working project.

-d (database name)

The -d argument specifies the current working database. If this argument is not used, then by default ART picks the database it used last time. In most cases, this argument does not need to be invoked.

-t (tree specifier)

If the -t argument is used, showtrees will find tree information only for the specified trees. The -t argument can also be used with the findanc command (see above). The desired trees can be specified as individual tree IDs, a confidence interval of trees, or a probability cutoff. If the -t argument is not used, showtrees will find information for all trees.

For example,

art showtrees -t 1 4 2 -p (project name)

shows information for trees 1, 4 and 2.

art showtrees -t ml -p (project name)

shows information for the most likely tree.

art showtrees -t ci:0.80 -p (project name)

shows information for the trees in an 80-percent confidence interval.

art showtrees -t gt:0.30 -p (project name)

shows information for the trees with a probability greater than 30 percent.

art showtrees -t all -p (project name)

shows information for all trees.

More examples:

Example 1:

art showtrees -p worms

Description: Show information for all the trees in the 'worms' project.

Example 2:

art showtrees -p worms -t ci:0.80

Description: Show information for the trees in the 'worms' project. The output will only contain trees in the 80-percent confidence interval.

Example 3:

art showtrees -t ml -p reptiles

Description: Show information for the most likely tree in the 'reptiles' project.


clean

'clean' will delete a project. (To create a project, use the 'calculate' command with the -p argument.)

Required arguments:

-p (project name)

The -p argument specifies the current working project. When used with the clean command, -p specifies which project should be deleted.

Optional arguments:

-d (database name)

The -d argument specifies the current working database. If this argument is not used, then by default ART picks the database it used last time. In most cases, this argument does not need to be invoked. When used with the clean command, -d specifies the target database for table removal.

Example:

art clean -p worms

Description: The project named 'worms' will be deleted, including any relevant tables in the current MySQL database.


help

'help' will display syntax information and provide other useful tips.

Arguments:

There are no arguments for the 'help' command.

Example:

art help