
Lead Eagle

Generic image processing pipeline.

Here we propose the self-learning Helmholtz Imaging Pipeline (sHIP) for collaborative online processing of biological, medical, oceanographic and remote sensing imagery. Image processing often consists of three basic processing steps: 1) segmentation of one or several regions of interest (ROI) from the remaining image, 2) quantification of certain characteristics such as the size, shape, color or texture of a given ROI and 3) classification of a given ROI.

Inspiration

File formats

Tiling of large images

Zoom  Number of tiles
0     1 (whole image in one tile)
1     4
2     16
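
A minimal sketch of how the tiles for one zoom level could be cut out of a large image, assuming a simple quadtree pyramid with Leaflet's default tile size of 256 px; the function name iter_tiles and the use of OpenCV for downscaling are illustrative choices, not part of the design above.

    import cv2
    import numpy as np

    TILE_SIZE = 256  # assumption: Leaflet's default tile size

    def iter_tiles(image: np.ndarray, zoom: int, max_zoom: int):
        """Yield (x, y, tile) for one zoom level.

        At max_zoom the image is served 1:1; every lower zoom level halves
        both dimensions, so zoom level z holds roughly 4**z tiles.
        """
        scale = 2.0 ** (zoom - max_zoom)
        scaled = cv2.resize(image, None, fx=scale, fy=scale,
                            interpolation=cv2.INTER_AREA)
        n_y = -(-scaled.shape[0] // TILE_SIZE)  # ceiling division
        n_x = -(-scaled.shape[1] // TILE_SIZE)
        for y in range(n_y):
            for x in range(n_x):
                yield x, y, scaled[y * TILE_SIZE:(y + 1) * TILE_SIZE,
                                   x * TILE_SIZE:(x + 1) * TILE_SIZE]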

Backend

Frontend

Attention: Make sure to include the Leaflet CSS.

Modules

Data

ImageBlob

  • ImageBlobID
  • Path
  • Metadata
  • Axes order (x, y, z, channels, time, ...)
  • Axes resolution / meaning
  • File type

Object

  • ObjectID
  • ImageBlobID
  • Bounding Box (PostGIS)
  • Metadata
  • Mask (1bit PNG)

Task

One of the modules (import/export/calculations/...)

  • TaskID
  • Parameters
  • Type
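
These records could map onto plain Python data classes roughly as sketched below; the field names follow the lists above, while the concrete types (and storing the mask as a path to a 1-bit PNG) are assumptions for illustration.

    from dataclasses import dataclass, field
    from typing import Any, Dict, Optional, Tuple

    @dataclass
    class ImageBlob:
        image_blob_id: int
        path: str
        metadata: Dict[str, Any] = field(default_factory=dict)
        axes_order: str = "yxc"                    # x, y, z, channels, time, ...
        axes_resolution: Optional[Dict[str, float]] = None
        file_type: str = "png"

    @dataclass
    class Object:
        object_id: int
        image_blob_id: int
        bounding_box: Tuple[int, int, int, int]    # stored via PostGIS in the DB
        metadata: Dict[str, Any] = field(default_factory=dict)
        mask_path: Optional[str] = None            # 1-bit PNG

    @dataclass
    class Task:
        task_id: int
        type: str                                  # import / export / calculations / ...
        parameters: Dict[str, Any] = field(default_factory=dict)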

Long-running jobs

Everything that takes more than 0.5s-1s.
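
One possible way to keep such work out of the request/response cycle, sketched with a process pool and an in-memory job registry; submit_job, job_status and JOBS are hypothetical names, and a real deployment would more likely use a task queue (e.g. Celery or RQ).

    import uuid
    from concurrent.futures import ProcessPoolExecutor

    _executor = ProcessPoolExecutor(max_workers=2)
    JOBS = {}  # job_id -> Future; illustrative in-memory registry

    def submit_job(func, *args, **kwargs) -> str:
        """Run func in the background and return a job id the UI can poll."""
        job_id = str(uuid.uuid4())
        JOBS[job_id] = _executor.submit(func, *args, **kwargs)
        return job_id

    def job_status(job_id: str) -> str:
        future = JOBS[job_id]
        if future.done():
            return "failed" if future.exception() else "done"
        return "running" if future.running() else "pending"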

Misc

Paths (=Blueprints)

  • /blobs/<imgblobid>: Raw blob (might be large)
  • /blobs/<imgblobid>/tiles/<z>/<x>/<y>: Tiles of a large image
  • /masks/<objid>.png?color=<color>: Mask for a specific object (for overlay)
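
Since the paths are labelled blueprints, they presumably correspond to Flask routes roughly like the sketch below; load_blob, render_tile and render_mask are hypothetical helpers standing in for the actual implementation.

    from flask import Blueprint, request, send_file

    # load_blob, render_tile and render_mask are hypothetical helpers.
    blobs = Blueprint("blobs", __name__)
    masks = Blueprint("masks", __name__)

    @blobs.route("/blobs/<int:imgblobid>")
    def get_blob(imgblobid):
        # Raw blob (might be large).
        return send_file(load_blob(imgblobid))

    @blobs.route("/blobs/<int:imgblobid>/tiles/<int:z>/<int:x>/<int:y>")
    def get_tile(imgblobid, z, x, y):
        # Tiles of a large image.
        return send_file(render_tile(imgblobid, z, x, y), mimetype="image/png")

    @masks.route("/masks/<int:objid>.png")
    def get_mask(objid):
        # Mask for a specific object, recolored for overlay.
        color = request.args.get("color", "FF0000")
        return send_file(render_mask(objid, color), mimetype="image/png")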

Manual Segmentation

  • Annotations:
    • Skeleton: Connect inner parts of an object (certainly foreground).
    • Outline: Draw the object boundaries (certainly background).
    • Point: Designate an object.
  • Application cases:
    • Connect oversegmented parts of an object: Draw skeleton lines connecting the individual segments.
    • Split touching objects: Place markers on the respective object centers. The segment pixels will be assigned to the object with the nearest annotation.
    • Separate overlapping objects: After placing markers, draw the skeleton lines of each object. The segment pixels will be assigned to the object with the nearest annotation. Refine with object outlines.
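
A minimal sketch of the "nearest annotation" assignment using a Euclidean distance transform; the function name and the encoding of annotations as an integer marker image are assumptions (a marker-controlled watershed would be a common alternative).

    import numpy as np
    from scipy import ndimage

    def assign_to_nearest_annotation(segment_mask: np.ndarray,
                                     markers: np.ndarray) -> np.ndarray:
        """Assign every pixel of a segment to the nearest annotated object.

        segment_mask: boolean mask of the (possibly merged) segment.
        markers:      integer image where skeleton/point annotations carry the
                      id of the object they designate and everything else is 0.
        """
        # For every pixel, find the coordinates of the nearest annotated pixel.
        _, (idx_y, idx_x) = ndimage.distance_transform_edt(
            markers == 0, return_indices=True)
        nearest = markers[idx_y, idx_x]
        labels = np.zeros_like(markers)
        labels[segment_mask] = nearest[segment_mask]
        return labels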

Environment

  • Python 3.7

Module-UI-interaction

A processing node provides a data structure that completely describes the required configuration settings.
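
Such a self-description could be a plain JSON-serializable dict that the UI turns into a configuration form; the schema and the ImageSegmentation example below are only an illustration of the idea.

    # Hypothetical self-description of an image segmentation node.  The UI can
    # render a settings form from this without knowing the node class itself.
    SEGMENTATION_NODE_DESCRIPTION = {
        "class_name": "ImageSegmentation",
        "label": "Image segmentation",
        "inputs": [{"name": "input_facet", "dtype": "image"}],
        "outputs": [{"name": "output_facet", "dtype": "mask"}],
        "parameters": [
            {"name": "min_object_area", "type": "int", "default": 0,
             "help": "Discard regions smaller than this area (px)."},
            {"name": "padding", "type": "int", "default": 0,
             "help": "Padding around extracted regions (px)."},
        ],
    }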

API

Database

Computation Graph

  • Model
    • NodeClass (class_name, FQN)
    • NodeInstance (ID, name, class, config)
      • config: JSON object of configuration values
    • Slots (node_id, name, [dtype])
    • Edge (from ID, from slot, to ID, to slot)
    • Object (project_id, parent_id)
    • Facet (node id, object id, slot id, meta)
      • FK: node id, object id, slot id
      • meta: metadata
    • ImageData (url, data)
      • FK: facet_id / node id, object id, slot id
      • URL: URL to image file on disk
      • data: Byte string of image data (e.g. PNG) (Optional)
      • Bounding Box: Bounding box in the on disk file to select a subarea
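
A rough sketch of how parts of this model could look as SQLAlchemy declarative classes; the column types, table names and keying ImageData by a facet_id are assumptions based on the list above.

    from sqlalchemy import JSON, Column, ForeignKey, Integer, LargeBinary, String
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class NodeInstance(Base):
        __tablename__ = "node_instance"
        id = Column(Integer, primary_key=True)
        name = Column(String)
        class_name = Column(String)   # FQN of the node class
        config = Column(JSON)         # JSON object of configuration values

    class Edge(Base):
        __tablename__ = "edge"
        id = Column(Integer, primary_key=True)
        from_node_id = Column(Integer, ForeignKey("node_instance.id"))
        from_slot = Column(String)
        to_node_id = Column(Integer, ForeignKey("node_instance.id"))
        to_slot = Column(String)

    class Object(Base):
        __tablename__ = "object"
        id = Column(Integer, primary_key=True)
        project_id = Column(Integer)
        parent_id = Column(Integer, ForeignKey("object.id"), nullable=True)

    class Facet(Base):
        __tablename__ = "facet"
        id = Column(Integer, primary_key=True)
        node_id = Column(Integer, ForeignKey("node_instance.id"))
        object_id = Column(Integer, ForeignKey("object.id"))
        slot = Column(String)
        meta = Column(JSON)           # metadata

    class ImageData(Base):
        __tablename__ = "image_data"
        id = Column(Integer, primary_key=True)
        facet_id = Column(Integer, ForeignKey("facet.id"))
        url = Column(String)                        # URL to image file on disk
        data = Column(LargeBinary, nullable=True)   # encoded image bytes (e.g. PNG), optional
        bounding_box = Column(JSON, nullable=True)  # subarea of the on-disk file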

Intra-node computation

  • One node may consist of multiple chained operations.

  • The elements of the chain pass data as dicts

  • Implemented with generators (see the sketch after this list)

    seq = Sequential([DataLoader(path, ...), Processor(params...), Export(path)])
    seq.process()
  • Data loader class: ~ import_data

    • Reads files, loads images

    • Outputs single image as object

      {
          object_id: ...
          [create: True,]
          facets: {
          	# For DataLoader
              input_data: {
                  meta: {filename: ...},
                  image: <np.array of shape = [h,w,c]>,
                  [create: True,]
              },
              # For Processor
              raw_img: {
                  meta: {region props...},
                  image: <np.array of shape = [h,w,c]>
              },
              contour_img: {
                  meta: {},
                  image: <np.array of shape = [h,w,c]>
              },
             	# Nothing for export
          }
      }
      
  • Vignetting correction: __init__(self, input_facet: str, output_facet: str, [params...])

  • Image Segmentation class: ~ process_single_image

    __init__(self, input_facet: str, output_facet: str, min_object_area=None, padding=None, ...)

    • Iterates over values of data generator
    • Outputs single ROI as object
  • Ecotaxa export class: ~ export_data

    __init__(self, input_facet: str, output_facet: str, output_path, ...)

    • Iterates over values of segmentation
    • Writes ZIP
  • Database Persistence class

    • Reads objects dict
    • Creates object (if necessary, create: True)
    • Creates corresponding facets (if necessary, create: True)
    • Creates image files, stores image_url
    {
       	object_id: ...
        create: true,
        facets: {
        	# For DataLoader
            input_data: {
                meta: {filename: ...},
                image: <np.array of shape = [h,w,c]>,
                create: true,
                image_url: ..., # When file is available on disk
            },
        }
    }
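
A minimal sketch of the generator-based chaining described above, assuming every element is a callable that consumes and yields object dicts; the VignetteCorrector body is only a placeholder.

    class Sequential:
        """Chain processing elements; each element wraps the previous generator."""

        def __init__(self, elements):
            self.elements = elements

        def process(self):
            stream = iter(())  # a DataLoader at the front ignores its input
            for element in self.elements:
                stream = element(stream)
            # Drain the final generator so all side effects (export, DB) happen.
            for _ in stream:
                pass

    class VignetteCorrector:
        def __init__(self, input_facet, output_facet, **params):
            self.input_facet = input_facet
            self.output_facet = output_facet
            self.params = params

        def __call__(self, objects):
            for obj in objects:
                image = obj["facets"][self.input_facet]["image"]
                corrected = image  # placeholder for the actual flat-field correction
                obj["facets"][self.output_facet] = {"meta": {}, "image": corrected}
                yield obj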
    

Full Pipeline

Pipeline([
    DataLoader(import_path, output_facet="raw"),
    VignetteCorrector(input_facet="raw", output_facet="color"),
    RGB2Gray(input_facet="color", output_facet="gray"),
    ThresholdOtsu(input_facet="gray", output_facet="mask"),
    ExtractRegions(
        mask_facet="mask",
        image_facets=["color", "gray"],
        output_facet="features",
        padding=10),
    FadeBackground(input_facet="gray", output_facet="gray_faded", alpha=0.5, color=1.0),
    DrawContours(input_facet="color", output_facet="color_contours"),
    Exporter(
        "/path/to/archive.zip",
        prop_facets=["features"],
        img_facets=["gray_faded", "color", "color_contours"])
])

Database persistence of objects

  • After Data loader

  • After Processing

  • DatasetReader:
    Converts files to objects

    for i in index:
        yield {
            "object_id": ...,
            "create": True,
            "facets": {
                "raw_image": {
                    "meta": None,
                    "image": Image(files[i]),
                    "create": True,
                },
            },
        }
    
    When processing:
    
    for idx, r in enumerate(regions):
        yield {
            "object_id": parent_id + idx,
            "create": True,
            "parent_id": parent_id,
            "parent_bbox": ...,
            "facets": {
                "mask": {
                    "image": Image(...),
                },
            },
        }
    
    
    Image class (a sketch follows below):
        Image(filename, bbox=None)
        get() -> np.ndarray
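
A possible implementation of this lazy Image wrapper; reading via OpenCV and the (x, y, w, h) bounding-box convention are assumptions.

    import cv2
    import numpy as np

    class Image:
        """Lazily loaded image, optionally restricted to a bounding box."""

        def __init__(self, filename, bbox=None):
            self.filename = filename
            self.bbox = bbox  # assumed (x, y, w, h) in pixel coordinates

        def get(self) -> np.ndarray:
            data = cv2.imread(self.filename, cv2.IMREAD_COLOR)  # BGR, shape (h, w, c)
            if data is None:
                raise IOError("Could not read {}".format(self.filename))
            if self.bbox is not None:
                x, y, w, h = self.bbox
                data = data[y:y + h, x:x + w]
            return data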
    

Classification

  • Rainer: some objects are vignetted several times, because there is a big one that contains several small ones.

    Is this really the case? Then: https://docs.opencv.org/3.4/d9/d8b/tutorial_py_contours_hierarchy.html

    Fred: Only the large one should be kept, although some smaller ones could be extracted as well. This is typical of a fecal aggregate, for instance (a larger phytoplankton aggregate containing small fecal pellets and/or fragments of larger fecal pellets). As such, this constitutes a category. I would keep them as they are. However, if it is possible to get info on what sort of FP are included in them, that is obviously relevant information.

  • Categories:

    • Phytoplankton aggregates

    • Fecal aggregates

    • Single cells (if any...)

    • Cylindrical FP

    • Ovoid FP

    • Unclear (we could put in this category what we are unsure of for now, and decide later together)

Authorization

  • First step: Projects are owned by a single user. Only this user may see the data.
  • Later: Allow shared access to projects

Data Storage

  • Data is owned by a project
  • Storage location: <root>/<user_id>/<project_id>/<node_id>/
  • Not in the static directory!

Image dimensions

  • Images are arrays of shape (h,w,c)
  • We use the OpenCV color channel order (BGR).
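
For illustration: OpenCV's imread already returns this layout, and a conversion is only needed when handing images to libraries that expect RGB; the file name below is a placeholder.

    import cv2

    image = cv2.imread("example.png")             # placeholder file; shape (h, w, c), BGR order
    h, w, c = image.shape
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # only for consumers that expect RGB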