efishSignalDetector

Welcome to the efishSignalDetector, a neural network framework adapted to detect electrocommunication signals of wave-type electric fish in spectrogram images. The model is a pretrained Faster R-CNN with a ResNet50 backbone. Only the final predictor is replaced, so that instead of the 91 classes of the COCO dataset it was trained on, it predicts the (currently) one category it should detect.

Data preparation

Data structure

The algorithm learns patterns from .png images and corresponding bounding boxes stored in .csv files.

  • image name includes the file it is derived from as well as its time and frequency bounds
  • image size is defined in config.py via figure size (IMG_SIZE=(7, 7)) and dpi (IMG_DPI=256).
  • .csv file where each row represents one assigned signal:
    • image: image name
    • x0, x1, y0, y1: image coordinates of bounding box
    • t0, t1, f0, f1: time and frequency bounds of the bounding box
  • .png images and the .csv file are stored in ./data/dataset
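For illustration, the table layout above can be parsed with pandas. The rows and the image name below are made up; the actual dataset loader lives in datasets.py.

```python
import io
import pandas as pd

# Two hypothetical rows in the column layout described above
csv_text = (
    "image,x0,x1,y0,y1,t0,t1,f0,f1\n"
    "rec1_t10.0_t15.0_f500_f800.png,12,80,30,55,10.2,11.0,620,700\n"
    "rec1_t10.0_t15.0_f500_f800.png,90,140,40,70,11.5,12.1,650,730\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# One image can carry several signals; collect its boxes as an (N, 4) array
boxes_per_image = {
    name: group[["x0", "y0", "x1", "y1"]].to_numpy()
    for name, group in df.groupby("image")
}
```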

Test-train-split

Use the script ./data/train_test_split.py to split the original .csv file into one for training and one for testing (both also stored in ./data/dataset).
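A per-image split, so that all boxes of one spectrogram end up in the same partition, could look like the following sketch. It is an illustration of the idea, not the actual content of ./data/train_test_split.py.

```python
import pandas as pd

def split_by_image(df, test_fraction=0.2, seed=42):
    """Split the bounding-box table so each image is either train or test."""
    # Shuffle the unique image names reproducibly
    images = df["image"].drop_duplicates().sample(frac=1.0, random_state=seed)
    n_test = max(1, int(len(images) * test_fraction))
    test_images = set(images.iloc[:n_test])
    is_test = df["image"].isin(test_images)
    return df[~is_test], df[is_test]
```

Splitting by image rather than by row avoids leaking boxes of one spectrogram into both partitions.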

ToDos:

  • FIX: name of generated png images. HINT {XXX:6.0f}.replace(' ', '0')
  • transfer images from ./data/train to ./data/dataset
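The HINT above pads a fixed-width formatted number with zeros after the fact; a zero-padded format specifier gives the same result in one step. The value 3 below is a made-up example.

```python
# Pad after formatting, as in the HINT above
padded_a = f"{3:6.0f}".replace(" ", "0")
# Zero-pad directly in the format specification
padded_b = f"{3:06.0f}"
```

Both yield "000003".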

model.py

Contains the model and its adjustments.

ToDos:

  • replace the backbone's input layer so it takes grayscale images (or even spectrogram arrays) instead of RGB input.

```python
# Hint: swap the first conv layer for a 1-channel (grayscale) version
import torch

model.backbone.body.conv1 = torch.nn.Conv2d(
    1, 64, kernel_size=(7, 7), stride=(2, 2),
    padding=(3, 3), bias=False
).requires_grad_(True)

# Hint: load an image and convert it to grayscale with PIL
from PIL import Image, ImageOps

im1 = Image.open(img_path)
im2 = ImageOps.grayscale(im1)
```
  • check other pretrained models from torchvision.models.detection, e.g. fasterrcnn_resnet50_fpn_v2

config.py

Contains hyperparameters used by the scripts.

ToDos:

  • replace TRAIN_DIR with DATA_DIR everywhere !!!

custom_utils.py

Classes and functions to save models and store loss values for later illustration. Also includes helper functions...
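A running-loss tracker of the kind described could be as simple as the following sketch; the actual classes in custom_utils.py may differ.

```python
class Averager:
    """Accumulate loss values and report their mean, e.g. once per epoch."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def send(self, value):
        """Add one loss value to the running sum."""
        self.total += value
        self.count += 1

    @property
    def value(self):
        """Mean of all values sent since the last reset."""
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self.total, self.count = 0.0, 0
```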

ToDos:

train.py

Trains the model using the images stored in ./data/dataset and the .csv files containing the bounding boxes meant for training. For each epoch, the test loss (computed without gradient tracking) is used to infer whether the model is better than the one of the previous epochs. If the new model is the best so far, its model.state_dict is saved in ./model_outputs as best_model.pth.
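The "keep the best epoch" logic described above can be sketched as follows, assuming torch; the function name and signature are illustrative, only the output path follows the README.

```python
import os
import torch

def save_if_best(model, test_loss, best_loss, out_dir="./model_outputs"):
    """Save model.state_dict() when this epoch's test loss improves."""
    if test_loss < best_loss:
        os.makedirs(out_dir, exist_ok=True)
        torch.save(model.state_dict(),
                   os.path.join(out_dir, "best_model.pth"))
        return test_loss  # new best loss
    return best_loss
```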

ToDos:

inference.py

Currently, this code performs predictions based on the test dataset (images and the corresponding .csv file). Eventually, however, this code shall be used to run inference on entirely unknown images. Prediction results are illustrated and stored in ./inference_output

ToDo:

  • implement a code path where no .csv file is needed...