Machine Learning in Elixir: accuracy on the test data

Title: Machine Learning in Elixir: Chapter 1
Example: Flower classification - final accuracy on the test data is inconsistent.

After completing chapter 1 and evaluating the trained model on the test data, I find that the results are inconsistent and depend on how the original dataset was shuffled. On my first pass, the model got 0% accuracy on the test data but 96% accuracy when I evaluated it on the training data. On my second attempt (after re-running the Livebook cell that shuffled the training set), the model got 40% accuracy on the test data. I’ve tried various combinations of set sizes, iterations, epochs, etc., but I cannot get a trained model to exceed 40% accuracy on the test data. Is there some way to make the results more consistent?