CW22 - 2022-04-04

Collaborative Ideas session group: C-Clover

Participants

Group photo



Collaborative Idea Title

Squashing conceptual bugs when computational bugs have been exterminated

Context / Research Domain

In physics, the theory behind the code is as complex as the code itself. Taking over a codebase means understanding both the in-depth theory and the simulation code, which is an almost impossible undertaking. Finding errors and bugs in such a codebase should be easier, in order to keep codebases sustainable, facilitate collaboration, and ensure correct results from the computation.

Problem

Computational bugs can be caught by the machine itself, but logical and conceptual bugs are difficult to track and work on. Conceptual bugs are particularly difficult for an individual to find when the code was written in the same mindset in which the theory was derived and the tests were implemented.

Example 1: Logical Errors - The code executes well but doesn’t give the desired results.

Example 2: Statistical Errors - The sampling works but should have sampled from a different source (see the Diagrams / Illustrations section).

Solution

The actual solution might be a standardised framework or guide, similar to The Turing Way or expanding on it. Rather than a tool, however, it could be helpful to have good practices, and community conversations around those practices, to reduce the gap between conceptual/logical errors and their resolution.

Diagrams / Illustrations

Example of Conceptual Errors

Heat-map plot from a well-established physics simulation software package, showing a central ‘dip’ (cold spot) caused by a statistical sampling error. The code works exactly as written, but because of a conceptual misunderstanding the wrong distribution is used: the sampling works but should have sampled from a different source. [CC-BY Eli Chadwick]
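To make this class of error concrete, here is a minimal sketch of an analogous mistake. It is a hypothetical example, not the bug from the simulation shown above: points are meant to be spread uniformly over a 2D disc, but the radius is drawn with the rule for a 3D ball. All function names are invented for illustration. The code runs without error either way; only a check against the expected physics (the fraction of points inside half the radius) reveals the problem.

```python
import math
import random


def sample_radius_disc(rng, radius):
    """Radius of a point drawn uniformly over the area of a 2D disc."""
    return radius * math.sqrt(rng.random())


def sample_radius_ball_rule(rng, radius):
    """Hypothetical conceptual bug: the 3D-ball rule reused for a 2D disc."""
    return radius * rng.random() ** (1.0 / 3.0)


def inner_fraction(sampler, radius=1.0, n=100_000, seed=0):
    """Fraction of sampled points that land inside the inner disc of radius/2."""
    rng = random.Random(seed)
    hits = sum(sampler(rng, radius) < radius / 2 for _ in range(n))
    return hits / n


# A uniform disc puts (1/2)**2 = 25% of its points inside half the radius.
expected = 0.25
correct = inner_fraction(sample_radius_disc)
buggy = inner_fraction(sample_radius_ball_rule)
print(f"expected {expected:.3f}, correct {correct:.3f}, buggy {buggy:.3f}")
```

With the ball rule, only about (1/2)**3 = 12.5% of the points fall in the inner region, so a heat map of the samples would be under-populated in the centre even though every line of the sampler executes correctly.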

Diagram of Proposed Solution

Diagram of Conceptual Framework encompassing Physics and Codebase with interactions [CC-BY Jesper Dramsch]


Licence: These materials (unless otherwise specified) are available under the Creative Commons Attribution 4.0 Licence. Please see the human-readable summary of the CC BY 4.0 and the full legal text for further information.


Discussion Notes:

Eli: Interacts with biology (e.g. bioconda) and wonders how this could be integrated into the physics ecosystem. Uses infrastructure to deploy to “Galaxy” and therefore has to use conda.

Jiada: Uses a different software packager to include more software than just conda.

Juncheng: Has a different project proposal about the compatibility of complex software dependencies. The problem is that small groups don’t have enough resources to take on a problem of this scale. They want to use features of different software packages together, but this is difficult to achieve because the packages have different dependencies.

Jiada: How do you put them together?

Juncheng: They create a separate environment for each software package, but combining them is impossible.

Eli: Has used containers for getting a single environment to work but not to resolve dependencies.

Jiada: Using different containers for different dependencies wastes resources on the communication between these containers.

Jiada: Does anyone know how to guarantee code quality?

Aman: Two ways:

  1. Code Reviews
  2. Developer Standards

CI/CD pipelines can check the code automatically:

  • Type checking
  • Tests
  • Formatting

Jiada: How do you balance code quality and time? Writing tests wastes a lot of time, so how do you make sure the development is still efficient?

Aman: Use Pytest and Hypothesis for testing and automation.

Jesper: Hypothesis is great for property-based testing. Start with minimal tests and don’t over-write them; otherwise you essentially re-create a Waterfall method and end up discarding half of the test cases once you realize something different was needed. Add more tests as needed.
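As a minimal sketch of what a property-based test could look like for physics code, the example below uses Hypothesis to check conservation laws instead of hand-picked input/output pairs. The function elastic_collision_1d, the value ranges, and the tolerances are assumptions made for illustration and were not part of the discussion.

```python
import math

from hypothesis import given, strategies as st


def elastic_collision_1d(m1, v1, m2, v2):
    """Velocities after a 1D elastic collision of two point masses."""
    total = m1 + m2
    v1_after = ((m1 - m2) * v1 + 2.0 * m2 * v2) / total
    v2_after = ((m2 - m1) * v2 + 2.0 * m1 * v1) / total
    return v1_after, v2_after


# Assumed, arbitrary ranges; bounded floats exclude NaN and infinity.
masses = st.floats(min_value=0.1, max_value=100.0)
velocities = st.floats(min_value=-100.0, max_value=100.0)


@given(m1=masses, v1=velocities, m2=masses, v2=velocities)
def test_collision_conserves_momentum_and_energy(m1, v1, m2, v2):
    v1_after, v2_after = elastic_collision_1d(m1, v1, m2, v2)

    # Property 1: total momentum is unchanged by the collision.
    p_before = m1 * v1 + m2 * v2
    p_after = m1 * v1_after + m2 * v2_after
    assert math.isclose(p_before, p_after, rel_tol=1e-9, abs_tol=1e-6)

    # Property 2: total kinetic energy is unchanged (elastic collision).
    e_before = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2
    e_after = 0.5 * m1 * v1_after**2 + 0.5 * m2 * v2_after**2
    assert math.isclose(e_before, e_after, rel_tol=1e-9, abs_tol=1e-6)
```

Pytest collects the test by its name, and Hypothesis generates the inputs. A single test like this encodes a piece of the theory (the conservation laws) rather than one expected number, which keeps the initial test suite small.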

Eli: The developer who wrote most of the code and has the physics knowledge has left. Eli knows some of the physics, but not all of it in depth, and now has to maintain the code as the only one left on the project.

Aman: Write good documentation, and use the questions people ask to inform additions to the documentation. Docstrings are great. Relying on asking rather than documenting creates accessibility issues. It is also great if the developers are directly accessible (and have the bandwidth to help) via open community communication channels like GitHub, Gitter etc.
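One way a docstring can carry the physics as well as the interface is sketched below. The function thermal_speed is a hypothetical example, not taken from any codebase discussed here; the point is that the docstring states the physical assumption, the units, and a checkable example.

```python
import math


def thermal_speed(temperature_k, mass_kg):
    """Return the most probable speed of a particle in an ideal gas.

    Physics assumption: speeds follow a Maxwell-Boltzmann distribution, so
    the most probable speed is sqrt(2 * k_B * T / m). This is NOT the mean
    or root-mean-square speed, which differ by constant factors.

    Units: temperature in kelvin, mass in kilograms, result in metres/second.

    >>> round(thermal_speed(300.0, 6.6335e-26))  # argon atom at 300 K
    353
    """
    k_b = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)
    return math.sqrt(2.0 * k_b * temperature_k / mass_kg)
```

The example in the docstring can be run with python -m doctest on the file, so the documented behaviour is checked alongside the other tests, and a new maintainer can see which physical quantity the code is meant to compute.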

// General discussion about possible topic.

Eli: How do you test for conceptual bugs rather than just code bugs?

Aman: Very difficult to catch these errors.