The training data is not there so you can recreate the model. It is there so you can inspect it and verify that everything the model ingested is either CC0 or under some other license, so that a derivative work produced with the model, if it adheres to that other license, also fulfills the requirements of the training data's license.
You don't have to keep the model and its training data together until you fork the model with the intent of distributing it.
[1/n]