Galaxy Data Libraries

Galaxy data libraries provide a convenient way to share Galaxy datasets within a group of Galaxy users or with everyone who has access to a specific Galaxy instance. The biggest advantages of data libraries are:

  • You can import data from the filesystem into the library without copying it.
  • You can import whole directories, preserving the folder structure within.
  • The size of library datasets does not count towards your usage quota.

Interface

You can explore the annotated interface:

Importing datasets

If you have the required permissions, you can create folders and import datasets into a library.

From history

Select Add datasets/from History in the folder toolbar.

Choose a history and select the datasets you want to import into the current folder, then click Add.

From user folder

This option is available only if the administrator of the Galaxy instance has configured it.

Select Add datasets/from User Directory in the folder toolbar.

Select the files you want to import. You can set Type and Genome if applicable.

Select the folders to be imported. You can choose whether to preserve the directory structure (otherwise it will be flattened) and whether to link to the files instead of copying them. Ask your administrator whether linking is appropriate; it depends on your file preservation policy.
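To make the difference between preserving and flattening concrete, here is a minimal Python sketch. It is not Galaxy code: the helper name plan_import and the sample paths are hypothetical, and the sketch only models how the resulting library locations would differ between the two options.

```python
from pathlib import PurePosixPath


def plan_import(relative_paths, preserve_structure):
    """Map each selected file to the location it would get in the library.

    Hypothetical helper for illustration only; it does not call Galaxy.
    """
    plan = {}
    for rel in relative_paths:
        path = PurePosixPath(rel)
        # Preserving keeps the chain of subfolders; flattening keeps only
        # the file name and drops every parent folder.
        target = path if preserve_structure else PurePosixPath(path.name)
        plan[rel] = str(target)
    return plan


selected = ["run1/reads/a.fastq", "run1/reads/b.fastq", "run1/qc/report.txt"]
preserved = plan_import(selected, preserve_structure=True)
flattened = plan_import(selected, preserve_structure=False)
```

With preservation enabled, run1/qc/report.txt keeps its full relative path. When flattened, it becomes report.txt and sits directly in the target folder next to a.fastq and b.fastq.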

Administration

Every library has to be created (and deleted) by an admin user, but it can subsequently be filled with datasets and folders by anyone with the appropriate permissions. A library has multiple levels of access control for the data inside, ranging from 'public' to 'one person only'.

Import configuration

User folder

As an admin, you can allow users to import datasets into libraries from a configured folder. This is enabled by setting user_library_import_dir in config/galaxy.yml to a path accessible by the user that runs the Galaxy process. Galaxy expects the folder to contain subfolders named after the email addresses of your instance's users. Each user will only see the contents of their own folder. A common setup is to use the same value for user_library_import_dir as for ftp_upload_dir, which allows users to upload files via FTP and then import them into either a history or a data library. This assumes you have an FTP server running and configured to work with Galaxy.

Example structure with user_library_import_dir set to ftp_upload:

ftp_upload/
└── marten@bx.psu.edu
    ├── 1.axt
    ├── 1.bam
    ├── 1.bed
    ├── 1.bed.spaces
    └── 1.fastqsanger
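An administrator might prepare this layout with a small script. The following is a minimal sketch, assuming the import directory is writable by the script; the helper name ensure_user_import_dirs is hypothetical, and the demo runs in a temporary directory rather than a real Galaxy installation.

```python
import tempfile
from pathlib import Path


def ensure_user_import_dirs(import_root, emails):
    """Create one subfolder per user email under import_root.

    Hypothetical helper; Galaxy itself is not involved here.
    """
    root = Path(import_root)
    created = []
    for email in emails:
        user_dir = root / email
        # exist_ok makes the script safe to re-run for existing users.
        user_dir.mkdir(parents=True, exist_ok=True)
        created.append(user_dir)
    return created


# Demo in a throwaway directory that mirrors the structure shown above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "ftp_upload"
    dirs = ensure_user_import_dirs(root, ["marten@bx.psu.edu"])
    (dirs[0] / "1.bed").write_text("chr1\t0\t100\n")
    listing = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))
```

Here listing contains the user's folder and the file inside it, both relative to the import root.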

Path paste

By setting allow_path_paste: true in config/galaxy.yml, you enable administrators to import from any path entered (beginning with file://) in the "Paste/Fetch data" menu of the Galaxy Upload Manager. This feature is not available to non-admin users.
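As an illustration of the file:// form, this small Python sketch turns an absolute server-side path into such a URI using only the standard library. The helper name to_file_uri and the example path are hypothetical, and the sketch says nothing about which paths a particular Galaxy instance will accept.

```python
import posixpath
from urllib.parse import quote


def to_file_uri(absolute_path):
    """Build a file:// URI from an absolute POSIX path (illustrative helper)."""
    # This sketch only handles absolute paths; relative paths are rejected.
    if not posixpath.isabs(absolute_path):
        raise ValueError("expected an absolute path")
    # quote() leaves "/" intact and percent-encodes characters such as spaces.
    return "file://" + quote(absolute_path)


uri = to_file_uri("/data/shared/genomes/hg38.fa")
```

Because an absolute path already starts with a slash, the result has three slashes after the scheme: file:///data/shared/genomes/hg38.fa.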

Permissions

When a data library is first created, it is considered "public" and is displayed in the "Data Libraries" view for every user, including anonymous ones. The Galaxy administrator can restrict access to a data library by associating roles with the data library's "access library" permission. This permission conservatively overrides the (dataset) "access" permission of the datasets contained in the data library.

For example, if a data library's "access library" permission is associated with Role1 and the data library contains public datasets, the data library will still only be displayed to those users that have Role1. However, if the data library's "access library" permission is associated with both Role1 and Role2 and the data library contains datasets whose (dataset) "access" permission is associated with only Role1, then users that have only Role2 will be able to access the library, but will not see those contained datasets whose (dataset) "access" permission is associated with only Role1.
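The two examples above can be sketched as a small Python model. This is not Galaxy's implementation: the function names are hypothetical, and the sketch makes two assumptions, namely that holding any one role associated with "access library" is enough to see the library, and that a restricted dataset requires all roles associated with its (dataset) "access" permission. With a single role per dataset, as in the examples, the second assumption does not affect the outcome.

```python
def can_access_library(user_roles, library_access_roles):
    """Return True if the library is displayed to a user with user_roles."""
    # No roles on "access library" means the library is public.
    if not library_access_roles:
        return True
    # Assumption: any one of the associated roles is sufficient.
    return bool(set(user_roles) & set(library_access_roles))


def visible_datasets(user_roles, library_access_roles, datasets):
    """datasets maps a dataset name to the roles on its (dataset) "access" permission."""
    # The library-level check overrides dataset-level access.
    if not can_access_library(user_roles, library_access_roles):
        return []
    visible = []
    for name, access_roles in datasets.items():
        # Assumption: the user must hold every role on the dataset;
        # an empty set means the dataset is public.
        if set(access_roles) <= set(user_roles):
            visible.append(name)
    return sorted(visible)


# First example: library restricted to Role1, contents public.
first_case = visible_datasets({"Role2"}, {"Role1"}, {"public.bed": set()})

# Second example: library open to Role1 and Role2, one dataset restricted to Role1.
library_roles = {"Role1", "Role2"}
datasets = {"public.bed": set(), "restricted.bam": {"Role1"}}
role1_view = visible_datasets({"Role1"}, library_roles, datasets)
role2_view = visible_datasets({"Role2"}, library_roles, datasets)
```

In this model a user with only Role2 sees nothing in the first case, even though the dataset itself is public. In the second case the same user sees public.bed but not restricted.bam.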

In addition to the "access library" permission, permission to perform the following functions on the data library (and its contents) can be granted to users (a library item is one of: a data library, a library folder, a library dataset).

  • add library item - Users that have the role can add library items to this data library or folder
  • modify library item - Users that have the role can modify this library item
  • manage library permissions - Users that have the role can manage permissions applied to this library item

The default behavior is for no permissions to be applied to a data library item, but applied permissions are inherited downward (with the exception of the "access library" permission, which is only available on the data library itself). Because of this, it is important to set desired permissions on a new data library when it is created. When this is done, new folders and datasets added to the data library will automatically inherit those permissions.
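The following Python sketch shows why the timing matters. It assumes that an item receives a copy of its parent's permissions at the moment it is created; that is one reading of the inheritance described above, not a statement about Galaxy's internals. The class LibraryItem and the role name LabMembers are hypothetical.

```python
class LibraryItem:
    """Toy stand-in for a data library, folder, or library dataset."""

    def __init__(self, name, parent=None):
        self.name = name
        # Assumed model: copy the parent's permissions once, at creation time.
        if parent is None:
            self.permissions = {}
        else:
            self.permissions = {
                action: set(roles) for action, roles in parent.permissions.items()
            }


library = LibraryItem("Sequencing runs")

# A folder added before any permissions were set inherits nothing.
early_folder = LibraryItem("added-before", parent=library)

library.permissions = {
    "add library item": {"LabMembers"},
    "modify library item": {"LabMembers"},
}

# A folder added afterwards starts with a copy of the library's permissions.
late_folder = LibraryItem("added-after", parent=library)
```

In this model early_folder ends up with no permissions, while late_folder carries both permissions for LabMembers. This is why the permissions are best set as soon as the library is created.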

Additional resources

The Galaxy Administrative Docs have even more configuration help plus admin scripts to help manage data libraries.