[docs] more docs and getting started

This commit is contained in:
Jan Grewe 2020-08-03 18:25:47 +02:00
parent 9d61ca7a1c
commit b376edb342
2 changed files with 146 additions and 7 deletions

# Project configuration
## Database connection
* If you do not have access to a remote database server or want to run it locally, go to [index](index.md) for information on setting up the database server.
Suppose you intend to work on a project named `foo`. Use the terminal to navigate to the project directory:
```bash
cd foo
```
To log into the database you need the `host`, the `database name`, the `user` name, and the `password`.
Note: the selected user needs at least read access to the selected database (write access may be required as well).
Let's assume you are working remotely, i.e. connecting to a remote database on a server.
In case you are working with a local database the ```database.host``` entry is ```"localhost"```.
If you omit ```database.user``` or ```database.password```, fishbook (or rather datajoint) will ask for the user credentials every time you run your Python scripts.
**IMPORTANT:** This configuration file is pure text and can be easily read by anyone. It **must not** be part of a public repository. **Do not add it to your version control system!**
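Putting the pieces together, a minimal configuration could be written like this. This is only a sketch: the file name `dj_local_conf.json` follows the datajoint convention, and all credential values are placeholders you must replace.

```python
import json

# Hypothetical example values -- replace host, user, and password
# with your actual credentials.
config = {
    "database.host": "mydb.example.com",  # or "localhost" for a local server
    "database.user": "me",
    "database.password": "secret",
}

# Write the configuration file into the project directory.
with open("dj_local_conf.json", "w") as f:
    json.dump(config, f, indent=4)
```

Remember to keep this file out of version control, e.g. by adding its name to your `.gitignore`.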

Fishbook has the following submodules:
```python
import fishbook as fb
# the following command will find and load all (!) dataset entries in the database
datasets, count = fb.Dataset.find()
print(count)
# or use
print(fb.Dataset.datasetCount())
```
### Finding what you might look for
For some classes, fetching all matches and converting them to Python objects of the *Dataset* class may take a while. You could then loop over the list of datasets and apply filters in Python. This works, but performs poorly; it is better to let the underlying database apply such restrictions.
Using the *Dataset* class again, restrictions can be applied like this:
```python
# to test your restrictions you may want to set the test=True flag
_, count = fb.Dataset.find(quality="good", min_date="2018-01-01", max_date="2018-12-31", test=True)
print(count)
# if you indeed want to fetch the results
datasets, count = fb.Dataset.find(quality="good", min_date="2018-01-01", max_date="2018-12-31")
print(len(datasets))
```
You can use the dataset as an entry point to explore what the database stores about it, e.g. which cell(s) were recorded and the subject information.
```python
d = datasets[0]
print(d)
cells = d.cells # get the cells recorded in this dataset
cell = cells[0]
print(cell)
s = cell.subject
print(s)
# in a given subject we may record several cells
print(len(s.cells))
```
### Which RePros were run in a given dataset?
In a dataset we may run several repros:
```python
repros = d.repro_runs()
for r in repros:
    print(r)
```
In this example this leads to output like the following. Two **Re**search **Pro**tocols were run: first the "BaselineActivity" repro and second the "ReceptiveField" repro.
```shell
RePro: BaselineActivity id: BaselineActivity_0
run: 0 on cell: 2018-01-10-ag
start time: 0.0 duration: 40.0796
RePro: ReceptiveField id: ReceptiveField_0
run: 0 on cell: 2018-01-10-ag
start time: 40.0796 duration: 236.42
```
### Getting information about the stimuli
The latter presented a set of stimuli we may be interested in:
```python
r = d.repro_runs()[-1]
stimuli = r.stimuli
print(len(stimuli)) # 90 in this example case
for stim in stimuli[:10]: # let's print the first 10
print(stim)
```
An entry in the *Stimuli* table (see the [ER schema](database_layout.md) for details) is characterized by the stimulus id, the start time or start index (depending on the data source), and the stimulus duration. The stimulus also carries a text field with the stimulus settings.
```python
stim.settings
>> 'ReceptiveField-1:\n\tReceptiveField-1:\n\t\tModality: electric\n\t\tSamplingRate: 37.59398496240602+- -1.000\n\t\tdur: 0.0+- -1.000\n\t\tampl: 0.0+- -1.000\n\t\tdeltaf: 0.0+- -1.000\n\t\tfreq: 0.0+- -1.000\n\t\tx_pos: 0.0+- -1.000\n\t\ty_pos: 0.0+- -1.000\n\t\tz_pos: 0.0+- -1.000\n\tEOD Rate: 826.6917939137279 Hz\n\tEOD Amplitude: 0.4263797848004206 mV\n\tGlobalEField: 0.0 V\n\tGlobalEFieldAM: 0.0 V\n\tLocalEField: 0.0 V\n\tI: 0.0 V\n\ttime: 294.9933 s\n\tdelay: 0.0 s\n\tintensity: 0.039810717055349734 mV/cm\n\tdur: 1.0 \n\tampl: 0.04 \n\tdeltaf: 20.0 \n\tfreq: 854.7108299738921 \n\tx_pos: 142.5 \n\ty_pos: 2.0 \n\tz_pos: -15.0 \n'
# parse the settings text with yaml
import yaml
settings = yaml.safe_load(stim.settings.replace("\t", ""))
print(settings)
```
The output:
```shell
{'ReceptiveField-1': None,
'Modality': 'electric',
'SamplingRate': '37.59398496240602+- -1.000',
'dur': 1.0,
'ampl': 0.04,
'deltaf': 20.0,
'freq': 854.7108299738921,
'x_pos': 142.5,
'y_pos': 2.0,
'z_pos': -15.0,
'EOD Rate': '826.6917939137279 Hz',
'EOD Amplitude': '0.4263797848004206 mV',
'GlobalEField': '0.0 V',
'GlobalEFieldAM': '0.0 V',
'LocalEField': '0.0 V',
'I': '0.0 V',
'time': '294.9933 s',
'delay': '0.0 s',
'intensity': '0.039810717055349734 mV/cm'}
```
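Note that many of the parsed entries remain strings that combine a value and a unit (e.g. `'EOD Rate': '826.69... Hz'`). A small helper can split such entries into a numeric value and a unit string; this function is not part of fishbook, just a sketch you may adapt:

```python
def split_value_unit(entry):
    """Split a string like '826.69 Hz' into (826.69, 'Hz').

    Returns (entry, None) for anything that is not a plain
    value-unit string (e.g. already-numeric values).
    """
    if not isinstance(entry, str):
        return entry, None
    parts = entry.split()
    if len(parts) == 2:
        try:
            return float(parts[0]), parts[1]
        except ValueError:
            pass  # e.g. '37.59+- -1.000' is left untouched
    return entry, None

value, unit = split_value_unit("826.6917939137279 Hz")
print(value, unit)
```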
### Finding the data
The *Stimulus* in conjunction with the *Dataset* provides all information needed to get the data from the recorded traces.
The dataset can be located from the *Dataset* class:
```python
print(d.data_source)  # the absolute path from which the data was imported
print(d.data_host)  # the fully qualified host name
print(d.has_nix)  # informs whether the data is stored in a nix file
```
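Before opening any files it can be useful to check whether the data is actually reachable from the machine you are working on. A minimal sketch using only the standard library (this helper is not part of fishbook, and it assumes `data_host` stores the plain node name):

```python
import os
import platform

def data_available(data_source, data_host):
    """Return True if we are on the data host and the path exists."""
    on_host = platform.node() == data_host
    return on_host and os.path.exists(data_source)

# e.g. data_available(d.data_source, d.data_host)
```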
Let's assume you have access to the dataset and it uses a [NIX](https://g-node.org/nix) file:
```python
import os
import glob
import nixio as nix

path = d.data_source
assert os.path.exists(path)
if d.has_nix:
    nix_file = glob.glob(path + os.sep + "*nix")[0]  # there should be only one nix file
    mtag_id = stim.multi_tag_id
    position_index = stim.index
    nf = nix.File.open(nix_file, nix.FileMode.ReadOnly)
    block = nf.blocks[0]  # there is only a single block in relacs-written files
    mt = block.multi_tags[mtag_id]
    data = mt.retrieve_data(position_index, "V-1")[:]  # we are interested in the membrane voltage
    nf.close()
    print(data.shape)
```
The *Dataset* is an entry point to the database.