README update

Till Raab 2023-10-27 13:12:24 +02:00
parent 3dbc78685d
commit e23301c0d1

@@ -6,6 +6,10 @@ spectrogram images. The model itself is a pretrained **FasterRCNN** Model with a
**ResNet50** Backbone. Only the final predictor is replaced, so that instead of the 91 classes of the
COCO dataset the model was pretrained on, it predicts the (currently) 1 category it should detect.
## Long-Term and major ToDos:
* implement a GUI to correct bounding boxes
* implement reinforcement learning
## Data preparation
### Data structure
The algorithm learns patterns based on **.png-images** and corresponding bounding boxes
@@ -24,7 +28,6 @@ Use the script **./data/train_test_split.py** to split the original .csv file in
training and one for testing (both also stored in ./data/dataset).
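A split of this kind can be sketched with the standard library alone. This is only an illustration, not the code in ./data/train_test_split.py: the function name `split_rows`, the 80/20 ratio, the fixed seed, and the column names of the example table are all assumptions.

```python
import csv
import io
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical annotation table; the real column names may differ.
csv_text = "image,x0,y0,x1,y1\n" + "\n".join(
    f"img_{i}.png,0,0,10,10" for i in range(10)
)
rows = list(csv.DictReader(io.StringIO(csv_text)))
train_rows, test_rows = split_rows(rows)
```

Note that this splits by row. If one image has several bounding-box rows, splitting by image name instead would keep all boxes of an image in the same subset.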
### ToDos:
* FIX: name of generated png images. HINT: {XXX:6.0f}.replace(' ', '0')
* in the long run: only save the raw-file bounding boxes in time and frequency (t0, t1, f0, f1) together with the hyperparameters of the corresponding spectrogram. USE THESE PARAMETERS IN DATASET_FN.
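The naming hint in the first ToDo can be tried out in isolation. The sketch below assumes the placeholder `XXX` stands for a non-negative running index. The function names and the `.png` suffix handling are hypothetical and not taken from the repository.

```python
def name_from_hint(index):
    """Pad with spaces to width 6, then turn the spaces into zeros (as in the hint)."""
    return f"{index:6.0f}".replace(" ", "0") + ".png"


def name_zero_padded(index):
    """Let the format spec do the zero padding directly."""
    return f"{index:06.0f}.png"


example_names = [name_from_hint(i) for i in (0, 42, 123456)]
```

For non-negative indices both variants give the same file name. They differ for negative values, where the replace-based version would put zeros in front of the minus sign, so the format-spec version may be the safer choice.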
## model.py
@@ -51,13 +54,13 @@ im2 = ImageOps.grayscale(im1)
* check other pretrained models from torchvision.models.detection, e.g. fasterrcnn_resnet50_fpn_v2
## dataset.py
Contains custom datasets and dataloader. These are based on the images that are stored in
./data/dataset.
## datasets.py
Contains custom datasets and dataloader functions and classes.
### ToDos:
* load/compute spectrogram directly and perform signal detection. E.g. spectrogram calculation as part of __getitem__
## config.py
Contains hyperparameters used by the scripts.
@@ -77,14 +80,14 @@ gradient tracking) is computed and used to infer whether the model is better tha
of the previous epochs. If the new model is the best model, the model.state_dict is saved in
./model_outputs as best_model.pth.
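The "save only if best" rule described here can be sketched without any deep-learning dependency. The class name `BestLossTracker` and the loss values are made up for illustration, and the actual training script may organise this differently.

```python
import math


class BestLossTracker:
    """Remember the lowest validation loss seen so far."""

    def __init__(self):
        self.best_loss = math.inf

    def is_new_best(self, val_loss):
        """Return True (and store the loss) if val_loss beats every earlier epoch."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            return True
        return False


tracker = BestLossTracker()
saved_epochs = []
for epoch, val_loss in enumerate([0.9, 0.7, 0.8, 0.6]):  # made-up losses
    if tracker.is_new_best(val_loss):
        # In the training script this is where the checkpoint would be written,
        # e.g. torch.save(model.state_dict(), "./model_outputs/best_model.pth").
        saved_epochs.append(epoch)
```

With these example losses a checkpoint would be written in epochs 0, 1 and 3, while epoch 2 is skipped because its loss is worse than the best one seen before.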
After training, a final validation is performed that produces images showing both the predicted and the
true bounding boxes. The images are stored in ./inference_output/dataset.
### ToDos:
## inference.py
Currently, this code performs predictions based on the test dataset (img and corresponding csv file).
However, this code shall be used to run inference on totally unknown images. Prediction results are illustrated
and stored in ./inference_output.
Script is used to run inference on unknown .png images. Results are stored in ./inference_output/<Dataset-Name>.
### ToDo:
* implement path where no csv file is needed...
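One possible shape for a csv-free input path is to collect the images directly from a folder. The following is a sketch under assumptions: the helper `collect_png_jobs` is hypothetical, and it assumes that `<Dataset-Name>` is simply the name of the input folder, which the README does not state.

```python
from pathlib import Path
import tempfile


def collect_png_jobs(image_dir, output_root="./inference_output"):
    """Pair every .png in image_dir with an output path under output_root/<folder name>."""
    image_dir = Path(image_dir)
    out_dir = Path(output_root) / image_dir.name
    return [(p, out_dir / p.name) for p in sorted(image_dir.glob("*.png"))]


# Small demonstration on a temporary folder with two images and one other file.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "my_dataset"
    dataset.mkdir()
    for name in ("b.png", "a.png", "notes.txt"):
        (dataset / name).touch()
    jobs = collect_png_jobs(dataset)
    job_names = [(src.name, dst.as_posix()) for src, dst in jobs]
```

Each pair holds the image to run the model on and the location where the annotated result could be written, so no csv file is needed to enumerate the inputs.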