Toward Best Practices to Develop, Reuse, and Share Code

For information on the DCAN Lab standards for writing code, please visit hackmd.

The Typical Workflow

  • A researcher starts with an idea
  • The researcher develops a working prototype that leads to cool results
  • Then someone else asks to use the code

Challenges

  • Code is not fully developed or bulletproofed
  • Poor documentation
  • Hard-coded paths
  • Not designed to be re-used
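
As one illustration of the hard-coded paths problem, the sketch below takes paths from the command line instead of writing them into the script. The argument names (`--input`, `--output-dir`) and the `_result.csv` suffix are hypothetical, chosen only for this example; they are not an existing DCAN Labs convention.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser so that no path is baked into the script."""
    parser = argparse.ArgumentParser(description="Hypothetical analysis script.")
    parser.add_argument("--input", type=Path, required=True,
                        help="Path to the input file (hypothetical argument).")
    parser.add_argument("--output-dir", type=Path, default=Path.cwd() / "results",
                        help="Directory where outputs are written.")
    return parser


def resolve_output(input_path: Path, output_dir: Path) -> Path:
    """Derive the output file name from the input instead of hard-coding it."""
    return output_dir / f"{input_path.stem}_result.csv"


# Example call with an explicit argument list (illustrative values only).
args = build_parser().parse_args(["--input", "data/sub-01.csv", "--output-dir", "out"])
print(resolve_output(args.input, args.output_dir))
```

Because every path arrives as an argument, the same script can run on another user's machine or inside a container without edits.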

In an Ideal World

  • Code should be fully documented, including:
    • Description
    • Example
    • Data
  • Code should be able to run on any platform
  • Code should be modularized
  • Modular elements should be able to be reused
  • Final code should be containerized

The New Workflow

  • Describe the goal of the project
  • Define inputs and outputs
  • Make a diagram as granular as possible
  • Reuse existing components
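
One possible way to make "define inputs and outputs" concrete is to write each box of the diagram down as a small record and check that every input is produced somewhere. This is a minimal sketch, not an existing lab tool: `ComponentSpec`, `unmet_inputs`, and all field and item names are hypothetical. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentSpec:
    """Describes one box in the flow diagram (all field names are illustrative)."""
    name: str
    language: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def unmet_inputs(pipeline: list[ComponentSpec], provided: set[str]) -> dict[str, list[str]]:
    """Return, per component, the inputs that neither the user nor an earlier step supplies."""
    available = set(provided)
    missing: dict[str, list[str]] = {}
    for step in pipeline:
        gaps = [item for item in step.inputs if item not in available]
        if gaps:
            missing[step.name] = gaps
        # Outputs of this step become available to the steps after it.
        available.update(step.outputs)
    return missing


# Hypothetical two-step pipeline: an interface feeding a code base.
pipeline = [
    ComponentSpec("interface", "python",
                  inputs=["neuroimaging_data"], outputs=["connectivity_matrix"]),
    ComponentSpec("community_detection", "matlab",
                  inputs=["connectivity_matrix", "settings"], outputs=["roi_labels"]),
]
print(unmet_inputs(pipeline, provided={"neuroimaging_data"}))
```

Here the check reports that `community_detection` still needs `settings`, which is the kind of gap a granular diagram is meant to expose before any code is written.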

Components

Key elements are listed below. A suggested color-coding scheme for flow diagrams is given in parentheses.

  • Code base (Black). Code developed and maintained by external users or by a member of DCAN Labs. Each code base should have a steward¹ who makes sure the code is kept under version control and updated, and that issues and feature requests are properly managed. The steward should also make sure compatibility is preserved for all applications that use the code base.
    • Language: this could be bash, MATLAB, Python, R, etc.
    • Inputs
      • Numerical matrices
      • Tables with demographics, covariates
    • Outputs
      • Numerical matrices
      • Tables
    • Settings
      • Different ways to run the code
  • Interfaces (Orange outline). A set of tools that reshape or reformat neuroimaging or demographic data so that any code base can use it. There are also interfaces to convert matrices back to neuroimaging data.
    • Neuroimaging to numerical matrices
    • Numerical matrices to neuroimaging data
    • Standard ways to read demographics
  • Visualizers (Purple outline). Packages that take standardly formatted demographic and neuroimaging data and make high-quality figures.
  • Markdown-formatters (Blue outline). Code that takes tables and makes markdown files.
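
A markdown-formatter can be quite small. The sketch below is one possible shape, not the lab's actual component; the function name `table_to_markdown` and the sample table are hypothetical. It turns a header and a list of rows into a pipe-delimited markdown table.

```python
def table_to_markdown(header, rows):
    """Render a header and rows as a pipe-delimited markdown table."""
    def fmt(cells):
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"Row {row!r} has {len(row)} cells, expected {len(header)}")

    lines = [fmt(header), fmt(["---"] * len(header))]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


# Hypothetical demographics table.
print(table_to_markdown(["participant_id", "age"], [["sub-01", 9], ["sub-02", 11]]))
```

Keeping the formatter separate from the code base that produces the table lets any component that outputs tables reuse it.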

¹ CodeStewardship is an alternative to CodeOwnership, emphasizing that code is the team's property, and not the sole province of any one person. A team member is granted stewardship over a piece of code. The steward has primary responsibility for the code's "care and feeding," with input and guidance from the community. The steward normally makes all changes to the code, though trusted members of the team may make changes that the steward is then responsible for vetting. For more information, visit Code Stewardship.

Case Example

We like to use community detection to identify individualized ROIs given a connectivity matrix. Code exists to do this, but we cannot use it as is on another system for several reasons.

[Figure: Case Example]

Proposed Elements for Flow Diagrams

The American National Standards Institute (ANSI) set standards for flowcharts and their symbols in the 1960s. The International Organization for Standardization (ISO) adopted the ANSI symbols in 1970. The current standard, ISO 5807, was revised in 1985. Generally, flowcharts flow from top to bottom and left to right. Reference

[Figure: elements]

Other Symbols

The ANSI/ISO standards include symbols beyond basic shapes. Some are:

[Figure: symbols]

Process for Implementing the Workflow

  • Agree on the proposal
  • Use a standard dictionary to define variables
  • Make existing components more robust
  • Keep code under version control
  • Work on documentation

Resources for Creating Visual Diagrams

Note: some of these resources have a limit on the number of documents allowed before a subscription is required.

Lucidchart

Draw.io

PlantUML

Creately

Cacoo

Miro