That's interesting, explicitly to require that bugs be reproduced in CI. It makes sense in theory, but in praxis (IME) CI systems tend to be overtaxed / underprovisioned - meaning this extra burden might be questionable. /$.02
Given the cost a typical CI system vs the salary cost of a typical dev, the business case should be fairly easy that devs always should have CI capacity available.
Ideally your CI workflow prevents merging something that breaks the tests. Also now if the first review merges but the other doesn't for some reason you've broken everyone else which is bad.
EDIT: at least in rebase workflows which is what I'm used to. I guess in merge workflows this works.
1. Merge #2 into #1 then merge the result, but this can obscure your history
2. Don’t make the test fail. Eg expect the incorrect behaviour and leave a comment explaining why it is incorrect and what the correct thing should be.
1. Introduce new test exhibiting bug
2. Introduce bug fix and update the test
If 1 and 2 are reviewed together you have less evidence that the test actually shows the bug being fixed.