Just my two cents, but I have a very different view to what others have suggested.
The mainloop structure can typically be applied to any game, from Tetris to Crysis. The difference is just the amount of stuff happening whilst it’s painting and updating.
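For anyone who hasn’t seen one, here is a minimal sketch of the mainloop shape I’m talking about. The `Game` interface, the `run` method and the fixed frame count are all made up for illustration; a real loop would run until the player quits and would pace itself more carefully.

```java
class MainLoopSketch {
    /** Hypothetical game contract: advance the state, then draw it. */
    interface Game {
        void update(double dtSeconds);
        void render();
    }

    /** Runs a fixed number of frames; every frame updates and renders, needed or not. */
    static void run(Game game, int frames, long frameMillis) {
        long previous = System.nanoTime();
        for (int i = 0; i < frames; i++) {
            long now = System.nanoTime();
            game.update((now - previous) / 1_000_000_000.0); // elapsed time in seconds
            previous = now;
            game.render();
            try {
                Thread.sleep(frameMillis); // crude frame pacing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop
                return;
            }
        }
    }

    public static void main(String[] args) {
        run(new Game() {
            private double elapsed = 0;

            @Override
            public void update(double dtSeconds) {
                elapsed += dtSeconds;
            }

            @Override
            public void render() {
                System.out.println("painting at t=" + elapsed);
            }
        }, 5, 16);
    }
}
```

Note that the loop paints on every pass even when nothing has changed.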
However, IMHO point and click is one of the few game genres that doesn’t match that structure well. It’s inefficient to run a mainloop for games which don’t update or paint constantly (like most point and click titles). Secondly, a point and click is far more GUI oriented than most games; buttons and application-style mouse interaction are more central to how users interact. That is why I would recommend building this as a conventional event-driven Swing app rather than using a mainloop.
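To make the event-driven idea concrete, here is a rough sketch of how it could look in Swing. The `Scene`, `ScenePanel` and `PointAndClickDemo` classes, the hotspot names and the coordinates are all invented for the example. The point is only that nothing runs until the user clicks, and the panel repaints only when the click actually changed something.

```java
import java.awt.Dimension;
import java.awt.Graphics;
import java.awt.Rectangle;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

/** Hypothetical scene model: named rectangular hotspots plus the current message. */
class Scene {
    private final Map<String, Rectangle> hotspots = new LinkedHashMap<>();
    private String message = "Click something.";

    void addHotspot(String name, int x, int y, int width, int height) {
        hotspots.put(name, new Rectangle(x, y, width, height));
    }

    /** Returns true if the click hit a hotspot and therefore changed the message. */
    boolean click(int x, int y) {
        for (Map.Entry<String, Rectangle> entry : hotspots.entrySet()) {
            if (entry.getValue().contains(x, y)) {
                message = "You look at the " + entry.getKey() + ".";
                return true;
            }
        }
        return false;
    }

    String message() { return message; }

    Map<String, Rectangle> hotspots() { return hotspots; }
}

/** Draws the scene and forwards mouse clicks to it. */
class ScenePanel extends JPanel {
    private final Scene scene;

    ScenePanel(Scene scene) {
        this.scene = scene;
        setPreferredSize(new Dimension(400, 300));
        addMouseListener(new MouseAdapter() {
            @Override
            public void mouseClicked(MouseEvent e) {
                // Repaint only when the click changed the state.
                if (scene.click(e.getX(), e.getY())) {
                    repaint();
                }
            }
        });
    }

    @Override
    protected void paintComponent(Graphics g) {
        super.paintComponent(g);
        for (Rectangle r : scene.hotspots().values()) {
            g.drawRect(r.x, r.y, r.width, r.height);
        }
        g.drawString(scene.message(), 10, 290);
    }
}

class PointAndClickDemo {
    public static void main(String[] args) {
        Scene scene = new Scene();
        scene.addHotspot("door", 50, 50, 80, 160);
        scene.addHotspot("lamp", 250, 40, 40, 60);
        // Swing components should be created on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Point and click sketch");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(new ScenePanel(scene));
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

Keeping the hit-testing in a plain `Scene` class, separate from the panel, also means the game logic can be exercised without opening a window.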
Most point and clicks also use their own domain specific language for scripting the game and its events (e.g. the text and options to display when a character is clicked on), as it can speed up development by reducing the amount of code that needs to be written and by allowing scripts to be loaded/reloaded at runtime. This is usually one of the main aspects of the engine, as it is what’s used to build the actual content of the game. So I would advise looking into lexer and parser generators, such as JFlex and CUP, as you’ll almost certainly need to roll your own point and click DSL.
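To give a feel for what such a DSL might involve, below is a small hand-written tokenizer and recursive descent parser. This is not JFlex or CUP (those generate the lexer and parser from specification files); it is just a sketch of the same idea at toy scale. The script syntax (`on click`, `say`, `option ... -> ...`) and the `ScriptParser` and `Handler` names are entirely made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScriptParser {
    /** One parsed "on click <target> { ... }" block of the hypothetical script language. */
    static class Handler {
        final String target;
        final List<String> lines = new ArrayList<>();     // text from "say" statements
        final List<String[]> options = new ArrayList<>(); // {label, name of next node}

        Handler(String target) { this.target = target; }
    }

    // Tokens: quoted string, arrow, brace, word, or any other single character
    // (the catch-all lets the parser report stray characters instead of skipping them).
    private static final Pattern TOKEN = Pattern.compile("\"[^\"]*\"|->|[{}]|\\w+|\\S");

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    ScriptParser(String source) {
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** Parses every handler in the script. */
    List<Handler> parse() {
        List<Handler> handlers = new ArrayList<>();
        while (pos < tokens.size()) {
            handlers.add(parseHandler());
        }
        return handlers;
    }

    private Handler parseHandler() {
        expect("on");
        expect("click");
        Handler handler = new Handler(name());
        expect("{");
        while (!peek().equals("}")) {
            String keyword = next();
            if (keyword.equals("say")) {
                handler.lines.add(string());
            } else if (keyword.equals("option")) {
                String label = string();
                expect("->");
                handler.options.add(new String[] { label, name() });
            } else {
                throw new IllegalArgumentException("Unexpected token: " + keyword);
            }
        }
        expect("}");
        return handler;
    }

    private String peek() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of script");
        }
        return tokens.get(pos);
    }

    private String next() {
        String token = peek();
        pos++;
        return token;
    }

    private void expect(String expected) {
        String token = next();
        if (!token.equals(expected)) {
            throw new IllegalArgumentException("Expected " + expected + " but got " + token);
        }
    }

    private String name() {
        String token = next();
        if (!token.matches("\\w+")) {
            throw new IllegalArgumentException("Expected a name but got " + token);
        }
        return token;
    }

    private String string() {
        String token = next();
        if (token.length() < 2 || !token.startsWith("\"") || !token.endsWith("\"")) {
            throw new IllegalArgumentException("Expected a quoted string but got " + token);
        }
        return token.substring(1, token.length() - 1); // strip the quotes
    }

    public static void main(String[] args) {
        String script = "on click guard {\n"
                + "    say \"Halt! Who goes there?\"\n"
                + "    option \"A friend.\" -> friendly\n"
                + "    option \"None of your business.\" -> hostile\n"
                + "}\n";
        for (Handler h : new ScriptParser(script).parse()) {
            System.out.println(h.target + ": " + h.lines + ", " + h.options.size() + " options");
        }
    }
}
```

Because the script is just text, it can be re-read and re-parsed while the game is running, which is where the runtime reloading benefit comes from. For anything beyond a toy grammar, a generated lexer and parser will likely be easier to maintain than extending this by hand.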
For research I’d look into the scripting languages used by SCUMM and other point and click engines as a basis. See what works well, what doesn’t, which bits you’ll need and which bits you don’t want. Then I’d find a book on building parsers in Java, but would advise one that is not game oriented (most game oriented ones just skim over how to embed existing languages).