Developing metrics and a catalog of applications to assess different kinds of Kubernetes performance.
We will likely choose different metrics that are important for HPC.
Note that I haven't started the operator yet because I'm testing ideas for the design.
To learn more:
Figure out issue with errors.IsNotFound not working...
We need a way for the entrypoint command that is monitored to (potentially) differ based on the container
For larger metric collections, we should have a log streaming mode (rather than waiting for Completed/Successful)
For services we are measuring, we likely need to be able to kill them after N seconds (to complete the job), or to specify the success policy on the metrics containers instead of the application
Add assertion checks to the Python tests
Plotting examples (Python parsers) are needed for:
io-sysstat
app-kripke
app-quicksilver
app-pennant
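The log streaming and "kill after N seconds" items above can be prototyped locally before any operator work. The following is a minimal sketch, not the operator's implementation: it uses a local subprocess in place of a container, and the function name `stream_with_deadline` and its parameters are hypothetical. It passes each output line to a callback as soon as it arrives, and kills the process if it is still running once the deadline passes.

```python
import subprocess
import sys
import threading


def stream_with_deadline(command, seconds, on_line=print):
    """Run `command`, hand each output line to `on_line` as it arrives,
    and kill the process if it is still running after `seconds`.

    Returns a tuple (returncode, timed_out).
    """
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered reads on our side of the pipe
    )
    timed_out = threading.Event()

    def expire():
        # Only record a timeout if the process has not already exited.
        if proc.poll() is None:
            timed_out.set()
            proc.kill()

    timer = threading.Timer(seconds, expire)
    timer.start()
    try:
        # Iterating the pipe yields lines as they are written.
        # The loop ends at EOF, i.e. when the process exits or is killed.
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
        proc.wait()
    finally:
        timer.cancel()
    return proc.returncode, timed_out.is_set()


# Stand-in for a long-running service that never exits on its own.
# The -u flag keeps the child's output unbuffered so lines arrive promptly.
fake_service = [
    sys.executable, "-u", "-c",
    "import time\nwhile True:\n    print('tick')\n    time.sleep(0.1)",
]
```

Calling `stream_with_deadline(fake_service, seconds=1)` prints lines as they arrive and returns once the process has been killed, with `timed_out` set to `True`. A command that finishes on its own before the deadline returns its normal exit code with `timed_out` set to `False`.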
License
HPCIC DevTools is distributed under the terms of the MIT license.
All new contributions must be made under this license.