Swen Vermeul authored
Welcome to pyBIS!
pyBIS is a Python module for interacting with openBIS, designed to be used in Jupyter. It works like a lightweight IDE for openBIS, supporting TAB completion and input checks, which should make a researcher's life easier.
Dependencies and Requirements
- pyBIS relies on the openBIS API v3
- openBIS version 16.05.2 or newer is required
- 18.06.2 or later is recommended
- pyBIS uses Python 3.3 and pandas
Installation
pip install pybis
That command will download and install pybis and all its dependencies.
If you haven't done so yet, install Jupyter Notebook:
pip install jupyter
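To confirm that both packages are importable in the active environment, a small check along these lines may help. It is only a sketch built on the standard library; the helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the two packages installed above.
for name in ("pybis", "jupyter"):
    print(name, "installed" if is_installed(name) else "missing")
```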
General Usage
Tab completion and other hints
Used in a Jupyter Notebook environment, pybis helps you to enter commands. After every dot (.) you can hit the TAB key to see the available commands.
If you are unsure what parameters to add to a method, add a question mark right after the method and hit SHIFT+ENTER. Jupyter will then look up the signature of the method and show its docstring.
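Outside of Jupyter, the same signature and docstring lookup can be approximated with Python's `inspect` module. The sketch below uses a stand-in function named `get_sample` that exists only for this demonstration; it is not the pyBIS method, and `describe` is a hypothetical helper.

```python
import inspect


def describe(func):
    """Return the signature and docstring of a callable as one text block."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "(no docstring)"
    return "{}{}\n{}".format(func.__name__, signature, doc)


def get_sample(identifier, only_data=False):
    """Stand-in for a real method; it only exists for this demonstration."""


print(describe(get_sample))
```

In a real session the same helper could presumably be pointed at a bound method of the connection object instead of the stand-in.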
When working with properties of entities, they might use a controlled vocabulary or be of a specific property type. Add an underscore (_) character right after the property and hit SHIFT+ENTER to show the valid values. When a property only accepts a controlled vocabulary, you will be shown the valid terms in a nicely formatted table.
connect to OpenBIS
Interactively, i.e. within a Jupyter notebook, you can use getpass to enter your password:
from pybis import Openbis
o = Openbis('https://example.com', verify_certificates=False)
import getpass
password = getpass.getpass()
o.login('username', password, save_token=True) # save the session token in ~/.pybis/example.com.token
In a script you would rather use two environment variables to provide username and password:
import os
from pybis import Openbis
o = Openbis(os.environ['OPENBIS_HOST'], verify_certificates=False)
o.login(os.environ['OPENBIS_USERNAME'], os.environ['OPENBIS_PASSWORD'])
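A script that reads its settings from the environment fails with a bare `KeyError` when a variable is missing. As a hedged sketch, a small helper can collect the three variables used above and report all missing ones at once; the name `read_openbis_settings` and the error wording are assumptions made for this example.

```python
import os


def read_openbis_settings(env=None):
    """Collect host, username and password, failing early if one is missing."""
    env = os.environ if env is None else env
    names = ("OPENBIS_HOST", "OPENBIS_USERNAME", "OPENBIS_PASSWORD")
    # Treat unset and empty variables alike.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

The returned tuple can be unpacked as `host, username, password = read_openbis_settings()` and passed on to `Openbis(...)` and `o.login(...)`.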
Check whether the session token is still valid and log out:
o.token
o.is_session_active()
o.logout()
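A script that runs repeatedly may prefer to log in only when the current session has expired. The following is a minimal sketch that relies solely on the `is_session_active()` and `login()` calls shown above; the helper name `ensure_session` and its return value are hypothetical.

```python
def ensure_session(o, username, password):
    """Log in only when the current session token is no longer valid.

    Returns True if a new login was performed, False otherwise.
    """
    if o.is_session_active():
        return False
    o.login(username, password)
    return True
```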
Masterdata
OpenBIS stores quite a lot of meta-data along with your dataSets. The collection of data that describes this meta-data (i.e. meta-meta-data) is called masterdata. It consists of:
- sample types
- dataSet types
- material types
- experiment types
- property types
- vocabularies
- vocabulary terms
- plugins (jython scripts that allow complex data checks)
- tags
- semantic annotations
browse masterdata
sample_types = o.get_sample_types() # get a list of sample types
sample_types.df # DataFrame object
st = o.get_sample_types()[3] # get 4th element of that list
st = o.get_sample_type('YEAST')
st.code
st.generatedCodePrefix
st.attrs.all() # get all attributes as a dict
st.get_validationPlugin() # returns a plugin object
st.get_property_assignments() # show the list of properties
# for that sample type
o.get_material_types()
o.get_dataset_types()
o.get_experiment_types()
o.get_property_types()
pt = o.get_property_type('BARCODE_COMPLEXITY_CHECKER')
pt.attrs.all()
o.get_plugins()
pl = o.get_plugin('Diff_time')
pl.script # the Jython script that processes this property
o.get_vocabularies()
o.get_vocabulary('BACTERIAL_ANTIBIOTIC_RESISTANCE')
o.get_terms(vocabulary='STORAGE')
o.get_tags()
create property types
Samples (objects), experiments (collections) and dataSets contain general attributes as well as type-specific properties. Before properties can be assigned to their respective entity type, the property types need to be created first.
pt = o.new_property_type(
code = 'MY_NEW_PROPERTY_TYPE',
label = 'yet another property type',
description = 'my first property',
dataType = 'VARCHAR'
)
pt_voc = o.new_property_type(
code = 'MY_CONTROLLED_VOCABULARY',
label = 'label me',
description = 'give me a description',
dataType = 'CONTROLLEDVOCABULARY',
vocabulary = 'STORAGE'
)
The dataType attribute can contain any of these values:
INTEGER
VARCHAR
MULTILINE_VARCHAR
REAL
TIMESTAMP
BOOLEAN
HYPERLINK
XML
CONTROLLEDVOCABULARY
MATERIAL
When choosing CONTROLLEDVOCABULARY, you must specify a vocabulary attribute (see example). Likewise, when choosing MATERIAL, a materialType attribute must be provided.
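The rules above can be checked locally before anything is sent to openBIS. This is only an illustrative sketch: `check_property_type_args` is a hypothetical helper, not part of pyBIS, and it encodes nothing beyond the list of values and the two companion attributes named in this section.

```python
DATA_TYPES = {
    "INTEGER", "VARCHAR", "MULTILINE_VARCHAR", "REAL", "TIMESTAMP",
    "BOOLEAN", "HYPERLINK", "XML", "CONTROLLEDVOCABULARY", "MATERIAL",
}

# Attribute that must accompany a given dataType.
REQUIRED_COMPANION = {
    "CONTROLLEDVOCABULARY": "vocabulary",
    "MATERIAL": "materialType",
}


def check_property_type_args(**kwargs):
    """Return a list of problems found in arguments meant for new_property_type."""
    problems = []
    data_type = kwargs.get("dataType")
    if data_type not in DATA_TYPES:
        problems.append("unknown dataType: {!r}".format(data_type))
    companion = REQUIRED_COMPANION.get(data_type)
    if companion and not kwargs.get(companion):
        problems.append("{} requires {}".format(data_type, companion))
    return problems


print(check_property_type_args(code="MY_CONTROLLED_VOCABULARY",
                               dataType="CONTROLLEDVOCABULARY"))
# prints ['CONTROLLEDVOCABULARY requires vocabulary']
```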
create sample types
sample_type = o.new_sample_type(
code = 'my_own_sample_type', # mandatory
generatedCodePrefix = 'S', # mandatory
description = '',
autoGeneratedCode = True,
subcodeUnique = False,
listable = True,
showContainer = False,
showParents = True,
showParentMetadata = False,
validationPlugin = 'Has_Parents' # see plugins below
)
sample_type.save()
assign properties to sample type
A sample type needs to be saved before properties can be assigned to it. This assignment procedure applies to all entity types (dataset type, experiment type, material type).
sample_type.assign_property(
prop = 'diff_time', # mandatory
section = '',
ordinal = 5,
mandatory = True,
initialValueForExistingEntities = 'initial value',
showInEditView = True,
showRawValueInForms = True
)
sample_type.revoke_property('diff_time')
sample_type.get_property_assignments()
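When several properties have to be assigned to one type, the call can be wrapped in a loop. The sketch below assumes that consecutive `ordinal` values are an acceptable way to order the properties; `assign_properties` is a hypothetical helper that uses only the `prop` and `ordinal` arguments shown above.

```python
def assign_properties(entity_type, prop_codes, first_ordinal=1):
    """Assign each property code in order, giving consecutive ordinals.

    entity_type is expected to be an already saved type object that offers
    assign_property(prop=..., ordinal=...).
    """
    for offset, code in enumerate(prop_codes):
        entity_type.assign_property(prop=code, ordinal=first_ordinal + offset)
```

Because the procedure is the same for all entity types, the helper should work equally for dataset, experiment and material types.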
create dataset types
dataset_type = o.new_dataset_type(
code = 'my_dataset_type', # mandatory
description=None,
mainDataSetPattern=None,
mainDataSetPath=None,
disallowDeletion=False,
validationPlugin=None,
)
dataset_type.save()
dataset_type.assign_property('property_name')
dataset_type.revoke_property('property_name')
dataset_type.get_property_assignments()
create experiment types
experiment_type = o.new_experiment_type(
code = 'my_experiment_type', # mandatory
description=None,
validationPlugin=None,
)
experiment_type.save()
experiment_type.assign_property('property_name')
experiment_type.revoke_property('property_name')
experiment_type.get_property_assignments()
create material types
material_type = o.new_material_type(
code = 'my_material_type', # mandatory
description=None,
validationPlugin=None,
)
material_type.save()
material_type.assign_property('property_name')
material_type.revoke_property('property_name')
material_type.get_property_assignments()
create plugins
Plugins are Jython scripts that can accomplish more complex data-checks than ordinary types and vocabularies can achieve. They are assigned to entity types (dataset type, sample type etc). Documentation and examples can be found here
pl = o.new_plugin(
name ='my_new_entry_validation_plugin',
pluginType ='ENTITY_VALIDATION', # or 'DYNAMIC_PROPERTY' or 'MANAGED_PROPERTY',
entityKind = None, # or 'SAMPLE', 'MATERIAL', 'EXPERIMENT', 'DATA_SET'
script = 'def calculate(): pass' # a JYTHON script
)
pl.save()
Users, Groups and RoleAssignments
o.get_groups()
group = o.new_group(code='group_name', description='...')
group = o.get_group('group_name')
group.save()
group.assign_role(role='ADMIN', space='DEFAULT')
group.get_roles()
group.revoke_role(role='ADMIN', space='DEFAULT')
group.add_members(['admin'])
group.get_members()
group.del_members(['admin'])
group.delete()
o.get_persons()
person = o.new_person(userId='username')
person.space = 'USER_SPACE'
person.save()
person.assign_role(role='ADMIN', space='MY_SPACE')
person.assign_role(role='OBSERVER')
person.get_roles()
person.revoke_role(role='ADMIN', space='MY_SPACE')
person.revoke_role(role='OBSERVER')
o.get_role_assignments()
o.get_role_assignments(space='MY_SPACE')
o.get_role_assignments(group='MY_GROUP')
ra = o.get_role_assignment(techId)
ra.delete()
Spaces
space = o.new_space(code='space_name', description='')
space.save()
space.delete('reason for deletion')
o.get_spaces(
start_with = 1, # start_with and count
count = 7, # enable paging
)
space = o.get_space('MY_SPACE')
space.code
space.description
space.registrator
space.registrationDate
space.modifier
space.modificationDate
space.attrs.all() # returns a dict containing all attributes
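The `start_with` and `count` arguments make it possible to walk through a long list page by page. The generator below is a sketch under stated assumptions: `start_with` is treated as the position of the first returned entry, the returned object is assumed to support `len()`, and a page shorter than `count` is taken to be the last one. `iter_pages` is a hypothetical helper, not a pyBIS function.

```python
def iter_pages(fetch, start, count):
    """Yield successive pages until a page comes back shorter than count.

    fetch is any callable accepting start_with and count keyword arguments,
    for example o.get_spaces or o.get_projects.
    """
    position = start
    while True:
        page = fetch(start_with=position, count=count)
        size = len(page)
        if size:
            yield page
        if size < count:
            break
        position += count
```

A call such as `for page in iter_pages(o.get_spaces, 1, 7): ...` would then process seven entries at a time.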
Projects
project = o.new_project(
space=space,
code='project_name',
description='some project description'
)
project = space.new_project(
code='project_code',
description='project description'
)
project.save()
o.get_projects(
space = 'MY_SPACE', # show only projects in MY_SPACE
start_with = 1, # start_with and count
count = 7, # enable paging
)
o.get_projects(space='MY_SPACE')
space.get_projects()
project.get_experiments()
project.get_attachments()
project.add_attachment(fileName='testfile', description= 'another file', title= 'one more attachment')
project.download_attachments()
project.code
project.description
# ... and many more
project.attrs.all() # returns a dict containing all attributes
project.freeze = True
project.freezeForExperiments = True
project.freezeForSamples = True
Samples
Samples are nowadays called Objects in openBIS. pyBIS does not yet fully support this term in every method where «sample» occurs.
NOTE: In openBIS, sample entities have recently been renamed to objects. All methods have synonyms using the term object, e.g. get_object, new_object, get_object_types.
sample = o.new_sample(
type = 'YEAST',
space = 'MY_SPACE',
experiment = '/MY_SPACE/MY_PROJECT/EXPERIMENT_1',
parents = [parent_sample, '/MY_SPACE/YEA66'],
children = [child_sample],
props = {"name": "some name", "description": "something interesting"}
)
sample = space.new_sample( type='YEAST' )
sample.save()
sample = o.get_sample('/MY_SPACE/MY_SAMPLE_CODE')
sample = o.get_sample('20170518112808649-52')
sample.space
sample.code
sample.permId
sample.identifier
sample.type # once the sample type is defined, you cannot modify it
sample.space
sample.space = 'MY_OTHER_SPACE'
sample.experiment # a sample can belong to one experiment only
sample.experiment = '/MY_SPACE/MY_PROJECT/MY_EXPERIMENT'
sample.project
sample.project = '/MY_SPACE/MY_PROJECT' # only works if project samples are enabled
sample.tags
sample.tags = ['guten_tag', 'zahl_tag' ]
sample.attrs.all() # returns all attributes as a dict
sample.props.all() # returns all properties as a dict
sample.get_attachments()
sample.download_attachments()
sample.add_attachment('testfile.xls')
parents, children, components and container
sample.get_parents()
sample.set_parents(['/MY_SPACE/PARENT_SAMPLE_NAME'])
sample.add_parents('/MY_SPACE/PARENT_SAMPLE_NAME')
sample.del_parents('/MY_SPACE/PARENT_SAMPLE_NAME')
sample.get_children()
sample.set_children('/MY_SPACE/CHILD_SAMPLE_NAME')
sample.add_children('/MY_SPACE/CHILD_SAMPLE_NAME')
sample.del_children('/MY_SPACE/CHILD_SAMPLE_NAME')
# A Sample may belong to another Sample, which acts as a container.
# As opposed to DataSets, a Sample may only belong to one container.
sample.container # returns a sample object
sample.container = '/MY_SPACE/CONTAINER_SAMPLE_NAME' # watch out, this will change the identifier of the sample to:
# /MY_SPACE/CONTAINER_SAMPLE_NAME:SAMPLE_NAME
sample.container = '' # this will remove the container.
# A Sample may contain other Samples, in order to act like a container (see above)
# The Sample-objects inside that Sample are called «components» or «contained Samples»
# You may also use the xxx_contained() functions, which are just aliases.
sample.get_components()
sample.set_components('/MY_SPACE/COMPONENT_NAME')
sample.add_components('/MY_SPACE/COMPONENT_NAME')
sample.del_components('/MY_SPACE/COMPONENT_NAME')
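The identifier change caused by assigning a container follows the pattern given in the comment above: the container identifier, a colon, then the sample code. The two helpers below merely illustrate that pattern with plain strings; their names are hypothetical and they do not talk to openBIS.

```python
def contained_identifier(container_identifier, sample_code):
    """Build the identifier a sample gets once it is placed in a container."""
    return "{}:{}".format(container_identifier, sample_code)


def split_contained_identifier(identifier):
    """Return (container_identifier, sample_code); container is None if absent."""
    container, separator, code = identifier.rpartition(":")
    return (container, code) if separator else (None, identifier)


print(contained_identifier("/MY_SPACE/CONTAINER_SAMPLE_NAME", "SAMPLE_NAME"))
# prints /MY_SPACE/CONTAINER_SAMPLE_NAME:SAMPLE_NAME
```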
sample tags
sample.get_tags()
sample.set_tags('tag1')
sample.add_tags(['tag2','tag3'])
sample.del_tags('tag1')
useful tricks when dealing with properties, using Jupyter or IPython
sample.p + TAB # in IPython or Jupyter: show list of available properties
sample.p.my_property_ + TAB # in IPython or Jupyter: show datatype or controlled vocabulary
sample.p['my-weird.property-name'] # accessing properties containing a dash or a dot
sample.set_props({ ... }) # set properties by providing a dict
sample.p # same thing as .props
sample.p.my_property = "some value" # set the value of a property
# value is checked (type/vocabulary)
sample.save() # update the sample in openBIS
querying samples
samples = o.get_samples(
space ='MY_SPACE',
type ='YEAST',
tags =['*'], # only sample with existing tags
start_with = 1, # start_with and count
count = 7, # enable paging
NAME = 'some name', # properties are always uppercase
# to distinguish them from attributes
props=['NAME', 'MATING_TYPE'], # show these properties in the result
**{ "SOME.WEIRD:PROP": "value"} # property name contains a dot or a
# colon: cannot be passed as a keyword argument
)
samples.df # returns a pandas DataFrame object
samples.get_datasets(type='ANALYZED_DATA')
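Since attribute filters and property filters end up in the same keyword arguments, it can help to assemble them in a dictionary first. The following sketch uppercases the property names, as the comments above require, and leaves attributes untouched; `build_sample_query` is a hypothetical helper and not part of pyBIS.

```python
def build_sample_query(attributes=None, properties=None):
    """Merge attribute filters with property filters into one kwargs dict.

    Property names are uppercased so they are not mistaken for attributes.
    """
    query = dict(attributes or {})
    for name, value in (properties or {}).items():
        query[name.upper()] = value
    return query


query = build_sample_query(
    attributes={"space": "MY_SPACE", "type": "YEAST"},
    properties={"name": "some name", "some.weird:prop": "value"},
)
print(query)
# The result can then be unpacked into the call: o.get_samples(**query)
```

Building the dictionary first also sidesteps the problem of property names that are not valid Python identifiers.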
freezing samples
sample.freeze = True
sample.freezeForComponents = True
sample.freezeForChildren = True
sample.freezeForParents = True
sample.freezeForDataSets = True
Experiments
NOTE: In openBIS, experiment entities have recently been renamed to collection. All methods have synonyms using the term collection, e.g. get_collections, new_collection, get_collection_types.
exp = o.new_experiment(
type='DEFAULT_EXPERIMENT',
space='MY_SPACE',
project='YEASTS'
)
exp.save()
o.get_experiments(
project='YEASTS',
space='MY_SPACE',
type='DEFAULT_EXPERIMENT',
tags='*',
finished_flag=False,
props=['name', 'finished_flag']
)
project.get_experiments()
exp = o.get_experiment('/MY_SPACE/MY_PROJECT/MY_EXPERIMENT')
exp.set_props({ key: value})
exp.props
exp.p # same thing as .props
exp.p.finished_flag=True
exp.p.my_property = "some value" # set the value of a property (value is checked)
exp.p + TAB # in IPython or Jupyter: show list of available properties
exp.p.my_property_ + TAB # in IPython or Jupyter: show datatype or controlled vocabulary
exp.p['my-weird.property-name'] # accessing properties containing a dash or a dot
exp.attrs.all() # returns all attributes as a dict
exp.props.all() # returns all properties as a dict
exp.attrs.tags = ['some', 'tags']
exp.tags = ['some', 'tags'] # same thing
exp.save()
exp.code
exp.description
exp.registrator
exp.registrationDate
exp.modifier
exp.modificationDate
exp.freeze = True
exp.freezeForDataSets = True
exp.freezeForSamples = True
Datasets
working with existing dataSets
sample.get_datasets()
ds = o.get_dataset('20160719143426517-259')
ds.get_parents()
ds.get_children()
ds.sample
ds.experiment
ds.physicalData
ds.status # AVAILABLE LOCKED ARCHIVED
# ARCHIVE_PENDING UNARCHIVE_PENDING
# BACKUP_PENDING
ds.archive()
ds.unarchive()
ds.attrs.all() # returns all attributes as a dict
ds.props.all() # returns all properties as a dict
ds.add_attachment() # attachments usually contain meta-data
ds.get_attachments() # about the dataSet, not the data itself.
ds.download_attachments()
download dataSets
ds.get_files(start_folder="/") # get file list as pandas table
ds.file_list # get file list as array
ds.download() # simply download all files to hostname/permId/
ds.download(
destination = 'my_data', # download files to folder my_data/
create_default_folders = False,# ignore the /original/DEFAULT folders made by openBIS
wait_until_finished = False, # download in background, continue immediately
workers = 10 # 10 parallel downloads (default)
)
dataSet attributes and properties
ds.set_props({ key: value})
ds.props
ds.p # same thing as .props
ds.p.my_property = "some value" # set the value of a property
ds.p + TAB # show list of available properties
ds.p.my_property_ + TAB # show datatype or controlled vocabulary
ds.p['my-weird.property-name'] # accessing properties containing a dash or a dot
ds.attrs.all() # returns all attributes as a dict
ds.props.all() # returns all properties as a dict
querying dataSets
- Examples of complex queries using method chaining.
- NOTE: properties must be in UPPERCASE to distinguish them from attributes
datasets = o.get_experiments(project='YEASTS')\
.get_samples(type='FLY')\
.get_datasets(
type='ANALYZED_DATA',
props=['MY_PROPERTY'],
MY_PROPERTY='some analyzed data'
)
datasets = o.get_experiment('/MY_NEW_SPACE/MY_PROJECT/MY_EXPERIMENT4')\
.get_samples(type='UNKNOWN')\
.get_parents()\
.get_datasets(type='RAW_DATA')