Haven't been here for a while; glad to see so many new faces these days.
Without knowing your budget, my personal suggestion is:
1. A good storage technology. Normally neither the OS file system nor
an RDBMS handles a large volume of data writes efficiently.
2. Enough CPU power to process and transform the data.
3. If you have a high query load on tables with millions of records, then of course a clusterable DB is a must.
I don't think JMS can really help, because a) your process flow is very simple; b) few JMS products are really scalable, since to avoid losing data they still have to store it on disk temporarily; c) messaging systems like Tibco and MQ are great, but very expensive.
I don't want to sound salesy, but a few friends of mine built a high-volume email system that they claim can handle 5-6 times the email load Yahoo has today, on a cluster of cheap hardware. The core technology is disk caching of incoming mail (which can be huge), designed by a key architect from Oracle. I think their technology solves a problem similar to yours. BTW, they are doing business in China now.
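I haven't seen their code, so the following is only a generic illustration of what "cache the input on disk first" can look like, not their design. It is a minimal sketch assuming one append-only spool file with length-prefixed records and a batched fsync; the names `Spool`, `read_spool` and `sync_every` are made up for this example.

```python
import os
import struct


class Spool:
    """Append-only spool file: each record is a 4-byte length prefix plus payload."""

    def __init__(self, path, sync_every=100):
        self.path = path
        # Hypothetical batch size: how many records to buffer between fsyncs.
        self.sync_every = sync_every
        self._f = open(path, "ab")
        self._pending = 0

    def append(self, payload: bytes) -> None:
        # Sequential write only; no index, no per-record file creation.
        self._f.write(struct.pack(">I", len(payload)))
        self._f.write(payload)
        self._pending += 1
        if self._pending >= self.sync_every:
            self.flush()

    def flush(self) -> None:
        # Push buffered bytes to the OS, then ask the OS to put them on disk.
        self._f.flush()
        os.fsync(self._f.fileno())
        self._pending = 0

    def close(self) -> None:
        self.flush()
        self._f.close()


def read_spool(path):
    """Yield payloads back in arrival order, e.g. for a downstream transformer."""
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                return  # clean end of file, or a header cut short by a crash
            (size,) = struct.unpack(">I", header)
            payload = f.read(size)
            if len(payload) < size:
                return  # truncated last record; stop here
            yield payload
```

The trade-off is in `sync_every`: a larger batch means fewer fsync calls and better throughput, but up to that many records can be lost if the machine dies between syncs.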
Nice to talk to you. I was too busy with my own stuff. To stay in shape, I have to give more time to sports than to surfing the web ...
I have used JMS heavily, and in general it's a very good solution for process-oriented systems. But to handle 500 files (not simple messages of less than 4k each) per 2 seconds at peak time, JBoss, WebSphere JMS (not true MQ) or WebLogic JMS (I work for BEA and I know their storage implementation) either can't stand that load or isn't efficient in terms of hardware usage.
What I am saying is that there are better, cheaper, lower-overhead solutions in this case.
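To put a rough number on the peak of 500 files per 2 seconds mentioned above, here is a back-of-envelope helper. The average file size is not stated anywhere in this thread, so every size passed in below is an assumption for illustration only; `sustained_write_rate` is a made-up name.

```python
def sustained_write_rate(files, window_s, avg_file_bytes):
    """Return (files per second, bytes per second) for a given peak window."""
    files_per_s = files / window_s
    return files_per_s, files_per_s * avg_file_bytes


# Assumed average sizes (hypothetical): 4 KB is the floor implied by
# "not simple messages of less than 4k each"; the larger ones are guesses.
for size in (4 * 1024, 100 * 1024, 1024 * 1024):
    rate, bytes_per_s = sustained_write_rate(500, 2, size)
    print(f"{size:>8} B/file -> {rate:.0f} files/s, {bytes_per_s / 1e6:.1f} MB/s")
```

Whatever the real size is, the store has to absorb 250 files per second at peak, and any layer that persists each file before passing it on adds at least one more write of the same volume on top of that.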