Advances in the automated detection and recording of capture events from on-vessel video footage


Wang, Dadong; Tuck, Geoff; Little, Rich; Li, Ron


Conference Material

International Fisheries Observer and Monitoring Conference, Vigo, Spain, 11-15 June 2018


Sustainable fishing is both a mandated objective of modern fishery regulators and an expectation of our communities. On-vessel human observation can be expensive and dangerous, and samples only a small portion of fishing activity. Increasingly, on-board cameras are replacing human observers on fishing vessels, producing large amounts of footage that is analysed manually by human experts; this remains expensive and inefficient. The CSIRO Marine Visual Technologies (MVT) Team has designed and implemented an automated video-analysis system. The system integrates existing Deep Neural Network (DNN) frameworks for the analysis of e-monitoring videos as the backend, and presents the analysis results and reports through a Graphical User Interface (GUI) frontend. The backend includes functions for detecting fish in the videos and identifying their species. The GUI allows a human supervisor to view and audit the analysis results and make corrections if an error in fish or bycatch species identification is found. Our system consists of a DNN module for the detection of fish in live-feed or recorded videos. A subsequent module classifies each detection as a recognised fish species, an unrecognised fish, or a non-fish object. A third module summarises the information generated by the upstream modules over the entirety of a fishing event.

Initial training data were obtained from the Kaggle competition hosted by The Nature Conservancy. These training datasets were augmented via image resizing, rotation and reflection, producing additional training images that cover varied conditions such as different fish orientations, different apparent sizes of fish of the same species, and partially obscured fish. In addition, commercial-in-confidence footage was acquired for our research. This footage was recorded simultaneously by two cameras operating at different frame rates.
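The augmentation step mentioned above (resizing, rotation and reflection of training frames) can be sketched as follows. This is an illustrative example only: the function name and the specific transforms chosen are assumptions for demonstration, not the MVT team's published pipeline.

```python
import numpy as np

def augment_frame(img: np.ndarray) -> list[np.ndarray]:
    """Generate augmented variants of a video frame (H x W x C array).

    Illustrative sketch of the augmentations named in the abstract:
    reflection, rotation, and resizing.
    """
    variants = []
    # Reflection: horizontal mirror, so fish facing either direction
    # are represented in the training set.
    variants.append(np.flip(img, axis=1))
    # Rotation: 90-degree steps cover different fish orientations.
    for k in (1, 2, 3):
        variants.append(np.rot90(img, k))
    # Resizing: crude nearest-neighbour downscale to half resolution,
    # simulating fish of different apparent sizes.
    variants.append(img[::2, ::2])
    return variants
```

In practice a training pipeline would apply such transforms on the fly (e.g. via a framework's data-augmentation utilities) rather than materialising every variant in memory.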
The detection, localisation and isolation of fish-like objects were performed by an adapted version of TensorBox, a proven network for object detection and tracking, which contains a GoogLeNet (a Convolutional Neural Network, or CNN) implemented in the TensorFlow framework. We chose isotropic resampling and Gaussian-noise filling for our transfer learning, video-based re-training, validation and testing. For monitoring operational compliance with recorded videos, the footage is partitioned into fishing events for higher-level analysis. The MVT auto-detection software is at the prototype stage, and we are looking for opportunities to test and apply it.
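The three-module structure described above (per-frame detection, classification into recognised species / unrecognised fish / non-fish, and per-event summarisation) can be sketched with the summarisation stage as a minimal example. All names, labels and the score threshold here are illustrative assumptions, not details of the MVT system.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Detection:
    """One classified detection from the upstream DNN modules."""
    frame: int    # frame index within the fishing event
    label: str    # a recognised species, "unrecognised_fish", or "non_fish"
    score: float  # classifier confidence in [0, 1]

def summarise_event(detections: list[Detection],
                    min_score: float = 0.5) -> dict[str, int]:
    """Sketch of the third module: aggregate per-frame detections over
    one fishing event into counts per label, dropping low-confidence
    detections and non-fish objects. Threshold is a placeholder."""
    counts = Counter(
        d.label for d in detections
        if d.score >= min_score and d.label != "non_fish"
    )
    return dict(counts)
```

A real implementation would also need to track individual fish across frames so that one fish seen in many frames is counted once, which is part of what a tracking-capable detector such as TensorBox enables.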


Electronic monitoring, machine learning

Environmental Monitoring

Funding Body Name: Australian Fisheries Management Authority


Conference Abstract


Wang, Dadong; Tuck, Geoff; Little, Rich; Li, Ron. Advances in the automated detection and recording of capture events from on-vessel video footage. In: International Fisheries Observer and Monitoring Conference; 11-15 June 2018; Vigo, Spain. CSIRO; 2018. NA.
