What are typical modern rendering/preprocessing optimizations?

What optimizations are used to render scenes? I can’t imagine that throwing all the polygons of a scene at the graphics card, and letting it clip, cull, and z-buffer, is wise or is what is done in practice. What is typically done on the CPU side? Are software frustum culling and BSP trees still used given modern GPUs?
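
To be concrete about what I mean by software frustum culling, here is a minimal sketch of the kind of CPU-side test I have in mind: checking a bounding box against six frustum planes before issuing any draw call. The types `Vec3`, `Plane`, `AABB` and the function `boxMayBeVisible` are hypothetical names made up for illustration, not part of the Quake 3 format or any library, and I'm assuming the plane normals point into the frustum.

```cpp
#include <array>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0, with n assumed to point into the frustum.
struct Plane { Vec3 n; float d; };

// Axis-aligned bounding box around a face, leaf, or group of faces.
struct AABB { Vec3 min, max; };

// Returns false only when the box lies entirely behind at least one plane.
// The test is conservative: a box near a frustum corner may be reported
// as visible even though it is not.
bool boxMayBeVisible(const AABB& box, const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum) {
        // Pick the box corner that lies farthest along the plane normal.
        Vec3 v{ p.n.x >= 0.0f ? box.max.x : box.min.x,
                p.n.y >= 0.0f ? box.max.y : box.min.y,
                p.n.z >= 0.0f ? box.max.z : box.min.z };
        // If even that corner is behind the plane, the whole box is outside.
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f)
            return false;
    }
    return true;
}
```

The idea would be to skip the draw call for any box that fails this test. Whether that is still worth doing per face, per group of faces, or not at all on current hardware is part of what I'm asking.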

I’ve parsed an old Quake 3 file and brute-force rendered the entire scene using a global vertex VBO and an individual index VBO per face. Fly-throughs take 50-80% CPU at 1024x768 on a 2.2 GHz machine with a GeForce 6800. Clearly, some optimizations can be done.

I guess I’m not sure what to delegate to the GPU and what should be done on the CPU side to determine what gets rendered. Which optimizations give the biggest bang for the buck, both in preprocessing the data and in processing it at runtime?

Any advice appreciated,

Monty