Hello,
Our software commonly returns a full result set to the client and we use the DataTables plugin to display table data. This has worked well, but at datasets grow larger, we are trying to move some of these requests server-side so the server handles the bulk of the work rather than the client. This has had me scratching my head in so many ways.
I'm hoping I can get a mix of general best practice advice but also maybe some IRIS specific ideas.
Some background
- I would admit our tables aren't best optimized for SQL in the sense that we don't utilize parent-child relationship in tables rather we point to other tables as property object references and often index on those pointers. I don't think this is uncommon, yeah? I think it becomes an issue when we are accessing a property by the pointer pointing to another pointer and so on until we make the object reference we want.
- I assume this next concept is common: depending on the table, some of the data reported to the client is calculated rather than specifically derived from the table or pointers. Or the data itself may be a link so there's an <a>Data Name</a> around it. This means if we need to filter data based on the calculated data we can't rely on just the SQL. We call a base query and then load the resulting object IDs into an array and then loop through the array and process, transform and filter the data in ObjectScript.
- Maybe that previous point isn't so common? Do I need to buckle down and learn how to get a lot fancier in SQL? We have methods that transform data in ObjectScript, but I don't think I can call those within ISC SQL. But really, at the end of the day, the SQL translates to a bunch of $ORDER-ing through indexes and such to get the data we need, right? So by calling a base query and then further processing in ObjectScript, I'm just doing what the SQL would have been doing in the guts. I know I'm oversimplifying, but does that make sense?
- A use case for example is that we have a table of lets say 1 million patients. A base SQL query can performantly get that down to let's say 100,000 patients that belong to a specific facility. I only need the first 10 to show on my client side DataTables view, but since I'm processing it all server side and don't want to return 100,000 results to the client, just 10, I still have to return the count of 100,000 to client. I think that's easily done with SQL COUNT, but then things start to get trickier when I add a filter or search text. Now I have to tell DataTables on the client side how many filtered records I have. So if someone searches for "Sally" the query will count that there's 100,000 records, return 50 of them that have "Sally" (another point on this later), and then only return the first 10. Or 11-20 if it's page two. And so on. Ok, that's still possibly all in SQL. But what if "Sally" is in the patient name? Or in the responsibly party name? Or the street name ("Sally St.")? Or in the city name (ok, no US cities name Sally's Cove in Newfoundland). Now my SQL query becomes more complex (WHERE FirstName Like %SALLY% or Street1 Like %SALLY% or Street2 Like %SALLY% . . .). Start adding other optional filters as well as search text. Now the filters are filtering on calculated data - now the SQL query can't determine the TOTAL count and the FILTERED count and my pagination breaks. Also, we have to consider the ordering of the results which can be done in the query if you can point to or calculate data in the query, but as soon as one this can't be determined from the query with ordering you either have to disable ordering on that column or write the simplest query and handle all the ordering, sorting, filtering, in ObjectScript.
I hope this help paint a picture and I hope folks have some ideas or experiences they can share. Some other ideas swimming in my head:
- Maybe we need to create temp tables or derived tables so the data is there and waiting as we want it for the request - this seems like a big project though. How/when do the tables refresh/update. Do they live in the same namespace? What if it's being updated and the user calls it are there locking issues?
- As I said before, maybe I just need to get better at SQL - I know people write way more complex and performant queries that I can even imagine.
- Create better indexes and table relationships - if we know we are pointing to a pointer 4 or five tables deep and we somehow connect the source table to that table with a new object property and index it appropriately?
- Get better at reexamine client versus server activities - in the case of Datatables, I'm trying to do all the table init config in the payload from the server so that I'm just passing in a block of JSON to the init and no having to fiddle with client side config. Same with formatting. Maybe I need to be better at letting the client handle the things the client it better at handling (e.g. forming data into URLs) and let the server just focus on raw data.
I would LOVE to hear your thoughts, folks. Thank you for any resources or best practices or real word experience or whims you may have!