I tested the first quad, just to see what I came up with and this is it.  Started off promising, but tailed off a bit at the end. LOL
Transconductance (GM) reference:  6000 = New Tube
Tube 1:  Tested Good on pass/fail and GM @ 6250
Tube 2:  Tested Good on pass/fail and GM @ 6000
Tube 3:  Tested Good on pass/fail and GM @ 6100
Tube 4:  Tested Good on pass/fail and GM @ 5500
If my math is correct (very well may not be), that's a 12% spread in GM between Tube 1 and Tube 4, figured as the difference divided by the higher reading: (6250 - 5500) / 6250.
Since most amps that use quads run them as two pairs working in tandem, you could match those pairs up as close as possible, but you'd still have a pretty wide spread in one of the pairs. And if I'm thinking right, your bias would be jacked up too, depending on which tubes you had your bias probe on:
Tubes 2 and 4, you'd have an 8.3% spread.
Tubes 1 and 3, you'd have a 2.4% spread.
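For anyone who wants to check the math, here's a quick Python sketch of how those percentages work out. The assumption (mine, not any official matching standard) is that spread means the difference between two readings divided by the higher one; `spread_pct` is just a name made up for this example, and the readings are in whatever units the tester displays.

```python
# GM readings from the tester for the four tubes listed above.
gm = {1: 6250, 2: 6000, 3: 6100, 4: 5500}

def spread_pct(a, b):
    """Percent difference between two readings, relative to the higher one."""
    high, low = max(a, b), min(a, b)
    return (high - low) / high * 100

print(f"Tubes 1 and 4: {spread_pct(gm[1], gm[4]):.1f}%")  # 12.0%
print(f"Tubes 2 and 4: {spread_pct(gm[2], gm[4]):.1f}%")  # 8.3%
print(f"Tubes 1 and 3: {spread_pct(gm[1], gm[3]):.1f}%")  # 2.4%
```

If you divide by the lower reading or by the 6000 reference instead, the numbers shift a little (Tube 1 vs Tube 4 comes out at about 13.6% or 12.5%), so it's worth knowing which basis a seller uses when they quote a matching percentage.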
I don't know what would be acceptable for a "matched" quad, but I'm thinking a 12% spread across all four ain't it.  LOL  I think I read somewhere that 3% or less variance is considered acceptable.  I'm going to go through all of them and also do some double checking, since I have no idea when my tester was last calibrated.  I'd like to see the results duplicated before I place too much weight on all of this.
I'm lucky that I have a tester, and could just take all 12 tubes and match them up based on GM.  But there are no markings on any of these tubes or boxes to indicate how, or even if they were matched.  You just have to take it on faith that they are, and so far...  I don't have much faith. LOL
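Here's a rough sketch of what matching by GM could look like in Python: sort the readings and group neighbors together. This is only a simple sort-and-chunk approach I'm assuming for illustration, not how any vendor actually matches tubes, and `group_by_gm` and `group_spread_pct` are made-up names. It only uses the four readings I have so far, grouped into pairs.

```python
def group_by_gm(readings, size):
    """Sort tubes by GM and chunk neighboring readings into groups of `size`."""
    ranked = sorted(readings.items(), key=lambda kv: kv[1])
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def group_spread_pct(group):
    """Spread within a group: (highest - lowest) / highest, as a percent."""
    values = [value for _, value in group]
    return (max(values) - min(values)) / max(values) * 100

# The four readings measured so far (tube number -> GM).
gm = {1: 6250, 2: 6000, 3: 6100, 4: 5500}

for group in group_by_gm(gm, size=2):
    tubes = sorted(tube for tube, _ in group)
    print(tubes, f"{group_spread_pct(group):.1f}%")
# [2, 4] 8.3%
# [1, 3] 2.4%
```

Once all 12 are measured, the same function with `size=4` would split them into three quads. If the count doesn't divide evenly, the last group just comes out short.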