From e23301c0d184d0695a9d735eda573affcd972d0e Mon Sep 17 00:00:00 2001
From: Till Raab
Date: Fri, 27 Oct 2023 13:12:24 +0200
Subject: [PATCH] README update

---
 README.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index e15edb3..a7333d6 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,10 @@ spectrogram images.
 The model itself is a pretrained **FasterRCNN** Model with a **ResNet50** Backbone. Only the final
 predictor is replaced to not predict the 91 classes present in the coco-dataset the model is trained
 to but the (currently) 1 category it should detect.
+## Long-Term and major ToDos:
+* implement gui to correct bounding boxes
+* implement reinforced learning
+
 ## Data preparation
 ### Data structure
 The algorithm learns patterns based on **.png-images** and corresponding bounding boxes
@@ -24,7 +28,6 @@ Use the script **./data/train_test_split.py** to split the original .csv file in
 training and one for testing (both also stored in ./data/dataset).
 
 ### ToDos:
-* FIX: name of generated png images. HINT: {XXX:6.0f}.replace(' ', '0')
 * on a long scale: only save raw file bounding boxes in frequency and time (t0, t1, f0, f1) and the
 hyperparameters of the corresponding spectrogram. USE THESE PARAMETERS IN DATASET_FN.
 ## model.py
@@ -51,13 +54,13 @@ im2 = ImageOps.grayscale(im1)
 
 * check other pretrained models from torchvision.models.detection, e.g. fasterrcnn_resnet50_fpn_v2
 
-## dataset.py
-Contains custom datasets and dataloader. These are based on the images that are stored in
-./data/dataset.
+## datasets.py
+Contains custom datasets and dataloader functions and classes.
 
 ### ToDos:
 * load/compute spectrogram directly and perform signal detection. E.g. spectrogram calculation as
 part of __getitem__
+
 ## config.py
 Containes Hyperparameters used by the scripts.
 
@@ -77,14 +80,14 @@ gradient tracking) is computed and used to infer whether the model is better tha
 of the previous epochs.
 If the new model is the best model, the model.state_dict is saved in ./model_outputs as
 best_model.pth.
+After training a final validation is performed where images showing both, the predicted and the
+true bounding boxes in an image. Images are stored in ./inference_output/dataset.
+
 
 ### ToDos:
 
 ## inference.py
-Currently, this code performs predictions based in the test dataset (img and corresponding csv file).
-However, this code shall be used to infer totally unknown images. Prediction results are ilustrated
-and stored in ./inference_output
+Script is used to infere unknown .png images. Results are stored in ./inference_output/
 
 ### ToDo:
-* implement path where no csv file is needed...
 