JCR integration status?


michaelbaranov
May 5th, 2006, 03:13 AM
Hello!

What is the current status of JCR support? I can't easily find any reference to JCR in Spring. Is there a module already?

BTW, Apache Jackrabbit left the incubator with its 1.0 release a while ago. I expect growing demand for JCR soon!

Thank you!!!
Michael.

Costin Leau
May 5th, 2006, 05:11 PM
Hi Michael. Spring Modules contains a JCR module in CVS; it has been there for the last 6 months. The CVS version has been updated to use Jackrabbit 1.0. The next release of Spring Modules will no doubt include the JCR module.

michaelbaranov
May 7th, 2006, 03:12 AM
Hello!

I have taken a closer look at the JCR Spring module and at the Jackrabbit implementation itself. Well, what to say... Jackrabbit is almost useless for production, IMHO. No offense to the developers, but the support for DB storage is feeble. File system storage is unusable because of a single fact: the whole repository may be damaged if the JVM is halted abruptly. There are heavy discussions on the dev forum, but things do not seem likely to change soon.

So the problems are:
1) DbFileSystem & SimpleDbPersistenceManager obtain a JDBC connection themselves, one per instance. Because they are effectively singletons, there is 1 connection per repository.
2) No connection failure handling. If that connection fails, we are doomed!
3) No way for the JCR session to participate in a data access transaction along with other JDBC calls.
4) The configuration mechanism is ... hard to use.

I would like:
1) To be able to use a data source (JDBC connection pooling)
2) To participate in JDBC transactions
3) To configure the repository via Spring IoC, not via repository.xml

So the solution I have come up with so far:
1) Based on DbFileSystem & SimpleDbPersistenceManager, write my own implementations that use JdbcDaoSupport (not that difficult; they will look much like DAO objects, stateless and thread-safe).
2) Use the already existing JcrSessionFactory and JcrTemplate (possibly slightly modified).
3) Modify the existing JcrRepositoryFactoryBean so that it can use IoC configuration: wire in the FileSystem, PersistenceManager etc. Those are then internally wrapped in FileSystemConfig etc. and a RepositoryConfig is formed.

The wiring may look like:



dataSource -> dataSourceTransactionManager ----------------------+
    |                                                            |
    v                                                            |
jdbcTemplate --> JdbcDao1 -------------------+                   |
  |    |    ...                              |                   |
  v    v                                     v                   v
  FS   PM                              serviceTarget --> transactionProxy (for service)
  |    |                                     ^
  v    v                                     |
jcrRepositoryFactoryBean <-- ...             |
    |                                        |
    v                                        |
jcrSessionFactory                            |
    |                                        |
    v                                        |
jcrTemplate --> jcrDAO1 ---------------------+
...



The idea is that this allows transactions to be demarcated on the service. The transactions will span both the jdbcDaos and the jcrDaos. What do you think? Is it going to work? At least I can't see an easier solution to these problems for now...
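
A minimal sketch of the mechanism this plan relies on, in plain Java with no Spring or Jackrabbit on the classpath: if every component fetches its connection from the same thread-bound transaction context, one commit or rollback on the service covers all of them. FakeConnection, TxContext, OrderDao and NodeStore are hypothetical names invented for illustration; OrderDao stands in for a jdbcDao and NodeStore for a JdbcTemplate-based PM. It shows only the sharing idea and says nothing about how Jackrabbit's own caching or transaction handling would react.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedTransactionSketch {

    /** Stand-in for a JDBC connection: buffers writes until commit. */
    static final class FakeConnection {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        void execute(String sql) { pending.add(sql); }
        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
    }

    /** Hypothetical holder that binds one connection to the current thread. */
    static final class TxContext {
        private static final ThreadLocal<FakeConnection> CURRENT = new ThreadLocal<>();

        static FakeConnection current() {
            FakeConnection c = CURRENT.get();
            if (c == null) {
                throw new IllegalStateException("no transaction bound to this thread");
            }
            return c;
        }

        /** Runs the work in one transaction: commit on success, rollback on failure. */
        static void inTransaction(FakeConnection c, Runnable work) {
            CURRENT.set(c);
            try {
                work.run();
                c.commit();
            } catch (RuntimeException e) {
                c.rollback();
                throw e;
            } finally {
                CURRENT.remove();
            }
        }
    }

    /** Plays the role of a plain JDBC DAO (assumption: hypothetical class). */
    static final class OrderDao {
        void insertOrder(String id) {
            TxContext.current().execute("INSERT order " + id);
        }
    }

    /** Plays the role of a PM that writes repository data through the same connection. */
    static final class NodeStore {
        void storeNode(String path, boolean fail) {
            TxContext.current().execute("INSERT node " + path);
            if (fail) {
                throw new IllegalStateException("simulated storage failure");
            }
        }
    }

    /** Service method: one JDBC write and one repository write in a single transaction. */
    static void placeOrder(FakeConnection c, boolean failInStore) {
        OrderDao dao = new OrderDao();
        NodeStore store = new NodeStore();
        TxContext.inTransaction(c, () -> {
            dao.insertOrder("42");
            store.storeNode("/orders/42", failInStore);
        });
    }
}
```

Calling placeOrder(conn, true) leaves nothing committed on conn, because the failure in NodeStore rolls back the OrderDao write as well.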

Michael.

Costin Leau
May 7th, 2006, 03:40 AM
Indeed, things could be cleaner with Jackrabbit, and unfortunately there isn't any other implementation out there that has the same features and is free (Apache/BSD). Jeceira hasn't been updated in about 5 months and eXo is GPL.
From what I've seen in the sources, Jackrabbit does a lot of the management internally and the code is very defensive. The configuration still can't be taken out of the XML. I made a request several months ago, and even though the feature was implemented quickly, it was crippled by some predefined defaults or assumptions, on the grounds that it could have been misused.

If you want to extend/replace Jackrabbit with some Spring-based components, feel free to do it; I can incorporate it into Spring Modules. However, note that this will transform the JCR module from support for JCR into an implementation of JCR, and that's not what it is intended for.
Jackrabbit is fairly settled, and changes can't be too big or happen very often. I would suggest looking at the other implementations (like Jeceira) and helping with those.
Don't get me wrong, feedback and code contributions are welcome; however, you would have much more freedom and control doing your own implementation than modifying an existing one (see the changelog and how many modifications have been made between small releases).

michaelbaranov
May 7th, 2006, 06:22 AM
I agree that my proposal is not a clean integration approach, since it is combined with a re-implementation of FS & PM. But currently I do not see any other technical way to do it.

Jackrabbit's developers are very concerned that it be used the 'right' way and not the 'wrong' way ;-) They take using Jackrabbit as a layer on top of a DB, or as a WebDAV gateway to a DB, as a personal offense!!! :confused: Unfortunately, Jackrabbit is all that we have in the Apache/BSD/LGPL domain...

Actually, my idea does not seem to require modifications to the Jackrabbit implementation itself, and that's great: we still stay on top of it, not inside it or instead of it.

Unfortunately, I don't plan to contribute anything to the Jeceira integration because of the state of Jeceira itself. In its current state it just seems to be unusable.

Costin, I was actually wondering whether my approach is going to work from the technical side, not how clean the idea is (pretty dirty). Are there any hidden spikes to be aware of? Actually, the JSR-170 spec says that this transaction approach is correct, even considering the possibility of multiple JCR sessions in one user transaction. So, Costin, is it gonna work out?

If it works, I will contribute back all the code to the communities, both Jackrabbit and Spring. Because of the 'unclean' approach, I'm not sure if it should be put into jcr integration springmodule... But it is up to others to decide. Hope this little effort will help others actually use Jackrabbit with Spring and an RDBMS in a more consistent way.

Costin Leau
May 7th, 2006, 12:40 PM
Are there any hidden spikes to be aware of? Actually, the JSR-170 spec says that this transaction approach is correct, even considering the possibility of multiple JCR sessions in one user transaction. So, Costin, is it gonna work out?
I haven't worked on Jackrabbit itself so I'm not the best judge. However, from looking at their internals I know they do some tricky stuff in there and it seems that the cache functionality is a bit too present.
Using multiple sessions inside the same transaction can happen, but remember that even in the well-established DB world some things are not supported by all implementations.
Moreover, JCR does not address transactions and JR does not fully support them; you can see that they still have issues (see the Jackrabbit JIRA).
Basically, I'm saying that due to the lack of implementations out there you are pretty much on your own.

michaelbaranov
May 7th, 2006, 04:10 PM
I do not expect any problems from transactions. FS & PM are the only places that use connections. Using JdbcTemplate there will guarantee that any code using the JCR session can be effectively wrapped in a transaction proxy. I do not want to mess with JTA transactions in any way, explicit or implied by the JR impl., so hopefully all their transaction bugs will not matter :) Actually, neither the JCR session nor the repository will even guess they are in a transaction. Actually, I do not even fully understand how transactions are supposed to work in the current impl... :o How do they manage transactions with their SimpleDbPersistenceManager & DbFileSystem, which commit or roll back at will? :confused:

Meanwhile I have implemented FS using JdbcTemplate, borrowing code from DbFileSystem. Unfortunately, I can't see a way to use prepared statements anymore, and BLOB streams have to be buffered first and wrapped before they can be passed to the outside (is there no way with JdbcTemplate to leave the ResultSet open?). I hope this will have little or no impact on performance.
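
The buffering step described above might look like this minimal sketch, which uses only java.io. BlobBuffering, toBytes and detach are hypothetical names, not part of Spring or Jackrabbit. The idea is to copy the BLOB stream fully into memory while the ResultSet is still open and hand callers an in-memory stream that stays readable afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BlobBuffering {

    /** Reads the source stream to its end and returns the bytes. */
    static byte[] toBytes(InputStream source) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        try {
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            // Rethrown unchecked so callers inside callbacks need no throws clause.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Returns an in-memory copy that no longer depends on the source stream. */
    static ByteArrayInputStream detach(InputStream source) {
        return new ByteArrayInputStream(toBytes(source));
    }
}
```

The cost is memory proportional to the BLOB size, so this is a trade-off rather than a free replacement for streaming directly from the database.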

Being addicted to IoC and Spring's clear and open structure, I find the JR impl. very closed and self-contained. All this I'm trying looks more like cheating ... :D

Costin Leau
May 7th, 2006, 04:29 PM
All this I'm trying looks more like cheating ... :D
Yes, that's why I was suggesting working on a new implementation rather than extending one that has different principles.

Jukka Zitting
Jul 13th, 2006, 05:32 AM
Just found this thread. It's a common complaint that people find the Jackrabbit configuration inflexible and the database integration limited, but let me explain why we don't see this as a major issue with Jackrabbit.

Jackrabbit is designed to work on the same tier as traditional databases, i.e. it is not really meant to be run on top of a separate database. The configurable persistence managers are really more like different table storage models (btree, hash table, etc.) in relational databases, and should not be considered as integration points.

In addition, the JCR transactions have nothing to do with transactions of the possible underlying database persistence managers. We handle transactions explicitly on top of the persistence layer, so you should not try to include the database persistence managers in any managed transactions. However, as you mentioned, the transaction support in Jackrabbit still has some problems.

I hope this explains some of the issues you raised.

BR,

Jukka Zitting

Costin Leau
Jul 13th, 2006, 05:50 AM
Just found this thread. It's a common complaint that people find the Jackrabbit configuration inflexible and the database integration limited, but let me explain why we don't see this as a major issue with Jackrabbit.
Thanks for joining the conversation, Jukka. One thing that would be really nice (we actually talked about it some time ago) would be to be able to configure Jackrabbit entirely in a programmatic manner. I must say I haven't followed the issue, but the last time I checked it was impossible, since some of the classes had default/private members and methods.
(Programmatic configuration has the big advantage of allowing already preconfigured objects to be passed in.)
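
To illustrate the point about preconfigured objects, here is a minimal plain-Java sketch. Store, PooledDbStore and RepositorySettings are hypothetical names made up for this example and are not Jackrabbit or Spring Modules classes. The settings object receives ready-made collaborators instead of class names that the repository would instantiate itself.

```java
import java.util.Objects;

public class ProgrammaticConfigSketch {

    /** Hypothetical stand-in for a storage component; not a Jackrabbit interface. */
    interface Store {
        String describe();
    }

    /** Hypothetical store that the caller has already fully configured. */
    static final class PooledDbStore implements Store {
        private final String dataSourceName;
        private final int poolSize;

        PooledDbStore(String dataSourceName, int poolSize) {
            this.dataSourceName = dataSourceName;
            this.poolSize = poolSize;
        }

        @Override
        public String describe() {
            return "db(" + dataSourceName + ", pool=" + poolSize + ")";
        }
    }

    /** Settings assembled from ready-made objects rather than class names in XML. */
    static final class RepositorySettings {
        private final Store fileSystem;
        private final Store persistenceManager;

        RepositorySettings(Store fileSystem, Store persistenceManager) {
            this.fileSystem = Objects.requireNonNull(fileSystem, "fileSystem");
            this.persistenceManager = Objects.requireNonNull(persistenceManager, "persistenceManager");
        }

        String summary() {
            return "fs=" + fileSystem.describe() + ", pm=" + persistenceManager.describe();
        }
    }
}
```

Because the caller builds the Store instances, the same pooled instance can be handed to both slots, which is the kind of sharing an IoC container would arrange.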

Jukka Zitting
Jul 13th, 2006, 06:04 AM
One thing that would be really nice (we actually talked about it some time ago) would be to be able to configure Jackrabbit entirely in a programmatic manner.

I agree, that would be nice. I've recently done some work (TransientRepository, etc) to make Jackrabbit a bit more pluggable, and I'd actually like to get rid of the current hardcoded configuration parsers and replace them with IoC-friendly interfaces for use with tools like Spring or Apache HiveMind. I suppose that would also help you.

Costin Leau
Jul 13th, 2006, 06:14 AM
I agree, that would be nice. I've recently done some work (TransientRepository, etc) to make Jackrabbit a bit more pluggable, and I'd actually like to get rid of the current hardcoded configuration parsers and replace them with IoC-friendly interfaces for use with tools like Spring or Apache HiveMind. I suppose that would also help you.
Definitely. I've already integrated TransientRepository into a FactoryBean; the code will be out in the Spring Modules 0.5 release.