Results 1 to 9 of 9
  1. #1
    Join Date
    Apr 2010
    Posts
    3

    Unanswered: The Role of the Database Server in Reporting

    I'd like to discuss the role of the database server in reporting applications.

    I usually work on data modeling, but one of my clients has asked me to help write views for reporting. We are using SQL Server, and we need to write views for Business Objects, for Salesforce, and for analysts who do ad-hoc reporting on their PCs (mostly with Access).

    While working on this project, I find myself repeatedly running into a debate about what tasks are appropriate for the database server to perform. Here is a specific example: For each donor displayed in Salesforce, we'd like to show a comma-delimited list of all the recent contributions from that donor. I know how to write a TSQL query to generate that list. Whenever I see that type of query discussed on a database forum, however, I also see strong warnings that functionality like that shouldn't be written in SQL Server; it should be implemented by the reporting tool instead. Those warnings seem to apply to all formatting tasks, like customizing the appearance of numbers and dates; combining prefix, first name, last name, and suffix into a full name; combining address components into a full address; annotating data with symbols to communicate special details; etc.

    Up to now, I've been ignoring the advice that says these tasks should not be performed in SQL Server. However, I'd really like to understand the theory behind the advice. Can anyone summarize the argument for me?

    My perspective is this: I've always assumed that the database server, in addition to being the transaction processing database, may also serve as the first layer of the reporting system. It is important to keep those roles conceptually separate. However, if you're careful to leave the transaction processing database alone, then it is perfectly appropriate to write reporting views that format data. Is my reasoning flawed?


    -TC

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    conceptually, reformatting and other "prettifying" operations (which would include your example of generating a comma-delimited string)belong in the application layer for one very good reason

    and that reason has to do with the fact that the database engine is a shared service, and you should offload all cycles from a shared service to the distributed layers as much as possible

    so your application layer should be viewed as local processing as opposed to central/shared processing, even though it's often located in a web server (e.g. php)

    so the rule of thumb for database access is: get in, get out, grab the data with as little extraneous processing as possible, and do any additional processing further away from the shared resource
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by r937 View Post
    and that reason has to do with the fact that the database engine is a shared service, and you should offload all cycles from a shared service to the distributed layers as much as possible
    ...and also because coding presentation-specific logic into your database objects will result in a plethora of one-off sprocs and views that are each dedicated to a single report/form, but duplicate each other's logic. If you concentrate on simply returning the essential data to the interfaces and let them decide how they want to display it, your code will be much more reusable.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    In addition to the points raised by both blindman and r937, data (and therefore the database) should be kept as free as possible of "application specific" features.

    Applications come and go, and as a developer you often end up supporting two or more generations or "incarnations" of an application at once. An example would be "fat client" versus "web page" versus a "hand held" (pda/phone/iPad/etc) application, all of which use the same data at the same time, but with very different presenations. If you can keep the data / database relatively pure, it will make your life MUCH easier!

    As a corollary, presentation can be locale specific. For example, time/date/number/etc formatting changes depending upon the locale of the audience. Lists are presented differently too (commas don't make sense in some locales and are actually inappropriate in others). Formatting needs to be kept at a level that is "locale aware" if you want that formatting done correctly.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  5. #5
    Join Date
    Apr 2010
    Posts
    3
    Thanks for the replies. Please bear with me -- I still don't get it.

    Suppose the reporting tool hits the database with a request and gets a nice, normalized set of data in reply. Now it has to prettify that data. Isn't it reasonable to use SQL for some prettifying tasks? (I know SQL isn't ideal for that, but it isn't bad either -- I can write a query to generate a comma-delimited list much faster than I can write procedural code to do the same thing.) So, if it is okay to run a few SQL commands as part of the reporting system, why isn't it okay to use SQL Server to run those commands? Why can't SQL Server be the first layer of the reporting system?

    I understand the reason for separating the data layer from the application layer, but that's easy. I can keep all the application-specific database objects in a different schema or a different database, or house them on a different server. When my presentation requirements change, I can easily overhaul the presentation views without affecting the transaction processing database. I don't see why the need to separate the data layer from the application layer is an argument for avoiding presentation views in SQL Server.

    Also, Blindman points out the value of avoiding one-off constructs that duplicate logic, but that sounds like an argument in favor of using SQL Server for prettifying, not an argument against it. If each reporting tool needs to present the same comma-delimited list, then they must duplicate logic. If I implement that logic once on the database server with a view, then I've avoided redundancy, no?

    R973 emphasizes the value of getting in and out of the database quickly to reduce the load on a shared service. I agree that's important, but that's part of the general cost / performance balance. I don't see why I can't write SQL Server views for the reporting system and let the DBA worry about what box to use for hosting those views.

    I don't necessarily disagree with you guys. I just need to argue this through to make sure I understand the details.


    -TC

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Your questions seem directed toward r937 and blindman, but I'll step up since no one else has replied yet.

    A lot depends on the scope of your problem. If a problem is small enough, almost any solution will work. The "rules" for separating the solution into "database", "app server", and "presentation" are based on the assumption that the task itself has grown to the point where you truly need multiple machines to solve the problem. If one machine can handle the load, there is no need to divide that load across multiple machines and no point in doing it.

    As problems become more complex, you typically need more hardware to address the issue and provide a solution. Adding support in different locales (cultural and linguistic combinations) is a major source of solution complexity.

    Once a problem scales to the point that application servers are necessary, the presentation layer (whether that be a fat client, an HTTP server, a WAP server, or something different) should deal only with the application server, the presentation layer should no longer deal directly with the database. While it isn't exactly the same thing, see the OSI Model to help you understand the process of layering and why it is important (in my mind critical) to the success of a distributed solution.

    The business of formatting (what characters to use, how to organize presentation, etc) are controlled by locale. Formatting time, money, lists (comma separated doesn't make sense in some locales) are presentation issues that need to be locale aware.

    In my mind, the database should NEVER be locale aware... That is a receipe for system failure and for major application support expense. I can't imagine that I would ever approve any code change that required pushing locale information down to the database if there was an application server.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  7. #7
    Join Date
    Apr 2010
    Posts
    3
    Mr. Phelan,

    Thank you for great explanation. It helps to know that this issue is primarily concerned with applications that need multiple machines. That helps me put things into perspective.

    Also, you've made the point that in some circumstances, formatting rules are location-specific. None of those circumstances apply to my project, so it is hard to see that as an important consideration.

    I appreciate your OSI reference. I'm trying to understand the OSI model, but it is way over my head right now. I'll keep studying it.

    By starting this thread, I was hoping to gain a clear understanding that I could apply at work. Frankly, however, I still don't know what to do. The bottom line for me is that I have a quick, easy, and obvious solution to a problem, but people are telling me it is a bad solution. And when I investigate why it is a bad solution, the answer seems to be that it could impair performance (although it doesn't in my case) and that it could limit my distribution options because of issues like location-specific formatting (although those issues don't apply in my case). And when, in the interest of good form, I look for a better solution, all I find are ugly, expensive, and incomplete options which offend my sense of elegance far more than prettified SQL Server views do.

    -TC

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    As long as all you have to solve are small and simple problems that one machine can handle, there is no point in having separate database, application, and presenation servers. That level of complexity is overkill for simple problems, at least in my mind.

    The potential flaw in that reasoning is that simple problems rarely stay simple... As an application gets more users and more options, it will become more complex. If you've ever worked on a project at home that was originally solved with "urban engineering" that worked fine for days/weeks/months/years but eventually crumbled under load, you have some insight into the scope of problem that you can incur when dealing with a poorly planned or executed software configuration that crumbles under its load. This is an ugly place to be, and nearly always requires a lot of time and expense to fix.

    If you've found a solution that works for you and you are comfortable with that solution, then you've got a good solution for you. Please give some thought to whether you can see a point where your implementation will fail, because if you can imagine a failure it is safe to assume that you'll have to deal with it in the foreseable future. Once you've given that some thought, if you're still comfortable with your solution, then go for it! By the same token though, if you come to ask us for help you can expect the same kind of comments you'd get from a plumber or electrician about your home project because we've had to fix so many of those problems in the past!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  9. #9
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by T.C. View Post
    Suppose the reporting tool hits the database with a request and gets a nice, normalized set of data in reply. Now it has to prettify that data. Isn't it reasonable to use SQL for some prettifying tasks?
    No. This is a bad practice. For reasons of CPU load re-usability of code, as mentioned above.

    Quote Originally Posted by T.C. View Post
    (I know SQL isn't ideal for that, but it isn't bad either -- I can write a query to generate a comma-delimited list much faster than I can write procedural code to do the same thing.)
    Congratulations on that. But later, when you want a report of the same data in NON-comma delimited format so that it can be easily imported into a spreadsheet and sorted/copied/filtered, what are you going to do? Right the exact same SQL code without the CSV conversion? Write procedural code to parse the CSV string back to a usable dataset?

    Quote Originally Posted by T.C. View Post
    Also, Blindman points out the value of avoiding one-off constructs that duplicate logic, but that sounds like an argument in favor of using SQL Server for prettifying, not an argument against it. If each reporting tool needs to present the same comma-delimited list, then they must duplicate logic.
    And if one manager asks for the list in tab-delimited format? Or semi-colon? Or alphabetically ordered?
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •