Category talk:Transaction processing


I'm just getting started on a pass over this new category. The inclusion of pages under this heading is not a commentary on the quality or precise content of those pages, but rather on what I believe those pages should contain. JCLately 04:34, 19 May 2007 (UTC)


The motivation for introducing this new category was to make it easier to locate articles on topics strongly related to the subject of transaction processing. In light of the large number of such articles, the absence of a transaction processing category seems a glaring omission: it's an important subject, and this category could greatly facilitate rapid discovery and review of what Wikipedia has to offer.

In attempting to assign various articles to this category, there is some question as to how to deal with the fact that many have already been assigned to broad categories like Category:Concurrency control and Category:Data management. According to Wikipedia:Categorization#Guidelines, one should generally avoid assigning an article to both a category and one of its subcategories.

The problem is that Wikipedia's category system is not a strict hierarchy, so while it is reasonable to consider transaction processing (TP) as a sub-topic of concurrency, it is also reasonable to consider concurrency as a sub-topic of TP. The two subjects have a large overlap, but each also extends considerably beyond the scope of the other. The fact that an article has been categorized under concurrency control should not preclude its also being filed under TP, even though concurrency control may be loosely regarded as a "parent" of TP (and vice versa).

Some editorial discretion is required to decide which concurrency articles are appropriate for inclusion under TP, and whether to remove any other category assignments in the process. The subject of data management wholly subsumes transaction processing, so it generally would be reasonable to remove this broad categorization when assigning an article to the more specific category of TP, unless the article happens to cover substantial data management subjects beyond the scope of TP. On the other hand, concurrency does not have a strict parent relationship to TP, so assigning an article to TP generally does not warrant its removal from concurrency, unless the only connection to concurrency is that implied by the connection to TP, which is seldom the case.

I propose that articles should be freely added to the TP category if they are strongly related or of fundamental interest to TP, so this category can properly serve the purpose for which it was intended. This should not include concurrency articles that are only remotely related to TP, but it should not exclude concurrency articles purely on the basis that concurrency is a "parent" category of TP. Rather than disallow an assignment to TP, consider removing the article from the broader category, if warranted, but allow that this frequently will not be the case. JCLately 07:24, 21 May 2007 (UTC)

Perhaps it would be helpful to just move Category:Transaction processing out from under Category:Concurrency if there isn't a strict parent-child relationship. Just make them parallel categories.
As an aside, I have to admit to being skeptical that "concurrency" in general is in any way a sub-topic of transaction processing. But I'd be interested in hearing why you believe it could be considered so. --Allan McInnes (talk) 01:36, 22 May 2007 (UTC)
There is too much overlap to pretend that TP and concurrency control are mere siblings. TP is the way that modern DBMSs handle concurrency control, mostly implicitly, but also through various options and explicit locking operations. Because the details of concurrency control are largely hidden from the user, TP hugely simplifies application development related to synchronizing dynamic database updates by concurrent users. However, in devising an efficient, scalable database application design, one needs to understand some of the more subtle details of concurrency control. For example, considerations of Locking, Deadlock, Optimistic concurrency control, Two-phase commit protocol, etc. can be extremely important to the design and implementation of transactional databases. I've tried to be selective about the particular concurrency topics directly categorized under TP, to focus on those of significant interest. This still leaves many other concurrency-related articles that I did not judge to be sufficiently related to TP to merit inclusion in this category.
As my own aside, what I was alluding to in my original disclaimer about "not a commentary on the quality or precise content of those pages" is my opinion that many of the articles I've categorized under TP - how shall I say it - suck. Before one can hope to put these into better shape, it would be helpful to have a quick way of reviewing all this. That was the point of this category, and I think it should be interesting to anyone studying the exciting field of transaction processing. JCLately 04:52, 22 May 2007 (UTC)