README update
parent
3dbc78685d
commit
e23301c0d1
19
README.md
@ -6,6 +6,10 @@ spectrogram images. The model itself is a pretrained **FasterRCNN** Model with a
|
|||||||
**ResNet50** Backbone. Only the final predictor is replaced so that, instead of the 91 classes
|
**ResNet50** Backbone. Only the final predictor is replaced so that, instead of the 91 classes
|
||||||
present in the COCO dataset the model was pretrained on, it predicts the (currently) 1 category it should detect.
|
present in the COCO dataset the model was pretrained on, it predicts the (currently) 1 category it should detect.
|
||||||
|
|
||||||
|
## Long-Term and major ToDos:
|
||||||
|
* implement gui to correct bounding boxes
|
||||||
|
* implement reinforcement learning
|
||||||
|
|
||||||
## Data preparation
|
## Data preparation
|
||||||
### Data structure
|
### Data structure
|
||||||
The algorithm learns patterns based on **.png-images** and corresponding bounding boxes
|
The algorithm learns patterns based on **.png-images** and corresponding bounding boxes
|
||||||
@ -24,7 +28,6 @@ Use the script **./data/train_test_split.py** to split the original .csv file in
|
|||||||
training and one for testing (both also stored in ./data/dataset).
|
training and one for testing (both also stored in ./data/dataset).
|
||||||
|
|
||||||
### ToDos:
|
### ToDos:
|
||||||
* FIX: name of generated png images. HINT: {XXX:6.0f}.replace(' ', '0')
|
|
||||||
* on a long scale: only save raw file bounding boxes in frequency and time (t0, t1, f0, f1) and the hyperparameters of the corresponding spectrogram. USE THESE PARAMETERS IN DATASET_FN.
|
* on a long scale: only save raw file bounding boxes in frequency and time (t0, t1, f0, f1) and the hyperparameters of the corresponding spectrogram. USE THESE PARAMETERS IN DATASET_FN.
|
||||||
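The long-term ToDo above (storing boxes as (t0, t1, f0, f1) and converting them inside the dataset) could be sketched roughly as follows. This is only an illustration under stated assumptions: a linear frequency axis, time running left to right, and frequency running bottom to top. `SpecExtent` and `tf_box_to_pixels` are hypothetical names and not part of the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecExtent:
    """Hypothetical description of what one spectrogram image covers."""
    t_min: float  # time at the left image edge (s)
    t_max: float  # time at the right image edge (s)
    f_min: float  # frequency at the bottom image edge (Hz)
    f_max: float  # frequency at the top image edge (Hz)
    width: int    # image width in pixels
    height: int   # image height in pixels


def _clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def tf_box_to_pixels(t0, t1, f0, f1, ext):
    """Map a time/frequency box to (x_min, y_min, x_max, y_max) in pixels.

    Assumes a linear frequency axis. Boxes reaching beyond the image
    are clipped to the image borders.
    """
    x_scale = ext.width / (ext.t_max - ext.t_min)
    y_scale = ext.height / (ext.f_max - ext.f_min)
    x_min = _clamp((t0 - ext.t_min) * x_scale, 0.0, ext.width)
    x_max = _clamp((t1 - ext.t_min) * x_scale, 0.0, ext.width)
    # Image rows count downward while frequency counts upward, so the
    # upper frequency f1 yields the smaller y value.
    y_min = _clamp((ext.f_max - f1) * y_scale, 0.0, ext.height)
    y_max = _clamp((ext.f_max - f0) * y_scale, 0.0, ext.height)
    return (x_min, y_min, x_max, y_max)
```

With such a helper, only the raw box and the spectrogram parameters would need to be stored, and the pixel box could be derived on demand for any image size.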
|
|
||||||
## model.py
|
## model.py
|
||||||
@ -51,13 +54,13 @@ im2 = ImageOps.grayscale(im1)
|
|||||||
|
|
||||||
* check other pretrained models from torchvision.models.detection, e.g. fasterrcnn_resnet50_fpn_v2
|
* check other pretrained models from torchvision.models.detection, e.g. fasterrcnn_resnet50_fpn_v2
|
||||||
|
|
||||||
## dataset.py
|
## datasets.py
|
||||||
Contains custom datasets and dataloader. These are based on the images that are stored in
|
Contains custom datasets and dataloader functions and classes.
|
||||||
./data/dataset.
|
|
||||||
|
|
||||||
### ToDos:
|
### ToDos:
|
||||||
* load/compute spectrogram directly and perform signal detection. E.g. spectrogram calculation as part of __getitem__
|
* load/compute spectrogram directly and perform signal detection. E.g. spectrogram calculation as part of __getitem__
|
||||||
|
|
||||||
|
|
||||||
## config.py
|
## config.py
|
||||||
Contains hyperparameters used by the scripts.
|
Contains hyperparameters used by the scripts.
|
||||||
|
|
||||||
@ -77,14 +80,14 @@ gradient tracking) is computed and used to infer whether the model is better tha
|
|||||||
of the previous epochs. If the new model is the best model, the model.state_dict is saved in
|
of the previous epochs. If the new model is the best model, the model.state_dict is saved in
|
||||||
./model_outputs as best_model.pth.
|
./model_outputs as best_model.pth.
|
||||||
|
|
||||||
|
After training, a final validation is performed that produces images showing both the predicted and the
|
||||||
|
true bounding boxes. Images are stored in ./inference_output/dataset.
|
||||||
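The best-model bookkeeping described above can be sketched as follows. This is a minimal sketch, assuming that a lower validation loss means a better model (the README does not state the exact criterion). `BestModelTracker` and `save_fn` are hypothetical names; in the training script the callback would presumably wrap something like `torch.save(model.state_dict(), ...)`.

```python
import math
from typing import Callable


class BestModelTracker:
    """Remember the lowest validation loss seen so far (assumed criterion)."""

    def __init__(self, save_fn: Callable[[], None]) -> None:
        self.best_loss = math.inf  # no epoch evaluated yet
        self.save_fn = save_fn     # called whenever a new best is found

    def update(self, val_loss: float) -> bool:
        """Record the loss of one epoch; save and return True if it is the best."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.save_fn()
            return True
        return False
```

Calling `update` once per epoch keeps the saving logic out of the training loop and makes the "best so far" criterion easy to swap for another metric.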
|
|
||||||
### ToDos:
|
### ToDos:
|
||||||
|
|
||||||
## inference.py
|
## inference.py
|
||||||
Currently, this code performs predictions based in the test dataset (img and corresponding csv file).
|
This script runs inference on unknown .png images. Results are stored in ./inference_output/<Dataset-Name>.
|
||||||
However, this code shall be used to infer totally unknown images. Prediction results are ilustrated
|
|
||||||
and stored in ./inference_output
|
|
||||||
|
|
||||||
### ToDo:
|
### ToDo:
|
||||||
* implement path where no csv file is needed...
|
|
||||||
|
|
||||||
|
|
||||||
|