This is known as zero-shot translation: a multilingual network trained on English↔German and English↔Chinese pairs can produce reasonable German↔Chinese translations without ever seeing a German-Chinese training example. Large Transformer models such as GPT-3 have also demonstrated zero-shot ability on translation tasks.
I wrote an article on GNMT (Google Neural Machine Translation), which powers Google Translate. Since training and maintaining a separate model for each of the n^2 language pairs is obviously impractical, both in terms of training time and data collection, GNMT's multilingual architecture was designed to support zero-shot translation, as well as address many other problems that come with translation.
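A minimal sketch of how a single multilingual model can be steered toward a target language: Google's multilingual NMT work prepends an artificial target-language token to the source sentence, so one shared model handles every pair. The helper below is illustrative, not Google's actual code; the `<2de>`-style token names follow the convention from their paper.

```python
# Sketch (assumed preprocessing, not GNMT's real implementation): the desired
# target language is signalled by an artificial token prepended to the source.
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend an artificial target-language token to the source text."""
    return f"<2{target_lang}> {source_sentence}"

# Pairs like these appear during training:
print(add_target_token("How are you?", "de"))       # <2de> How are you?
print(add_target_token("How are you?", "zh"))       # <2zh> How are you?

# Zero-shot at inference: a German source with a Chinese target token,
# a direction the model was never directly trained on.
print(add_target_token("Wie geht es dir?", "zh"))   # <2zh> Wie geht es dir?
```

Because the encoder and decoder are shared across all languages, the model learns a representation in which the target token alone is enough to request an unseen translation direction.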
Hope this helped!