Combining learning with knowledge-rich planning allows for efficient multi-agent solutions to the problem of perpetual sparse rewards