Thursday, February 29, 2024

Breaking up with SQL Server

I was inspired to write this after reading a post from Dave Mason regarding breaking up with Big Tech companies. Yet again I haven't written for some time; my last post was about the release of SQL Server 2022, which I felt was rather underwhelming and certainly not helped by some of the issues with it.

It also came at a funny time in my career. For anyone who doesn't know me, I've worked with SQL Server for many years and, from my own perspective, have had quite a remarkable career when I think back to my early days dabbling with MS Access. What followed those early days was a long journey of developing and honing my skills after falling, completely by accident, into the wonderful world of SQL Server overnight, and it became my dedication.

But a couple of years ago things changed. At the time I was an avid blogger and occasional presenter who lived and breathed SQL Server as anyone who knows me well enough would testify. Unfortunately there were some things going on in my personal life which meant I had to stop a lot of those activities. I was still working but the blogs and everything else just stopped, and weirdly after a short time I realised that I didn't really miss any of it either. I definitely tried; I wrote a few blog posts here and there, I even put together a couple of new sessions but just couldn't seem to get that spark back to really get back into it.

Which leads me to the breaking up part...and I have to start off by saying, SQL Server...it's not you, it's me!

I know, I rolled my eyes a little too, but I'm going to be even more clichéd: deep down I do still love [working with] SQL Server. We've been in a relationship for a long time, but over time things and people change. There's just something that isn't there any more and, to be truthful, it's time I moved on.

OK, that's enough of the technical equivalent of a Dear John letter as I was getting emotional.

The fact is that whilst I'll still be working with SQL for sure, I realised I do actually miss the blogging, the speaking and the rest of it; it's just the subject matter that I had grown tired of, and so it's time for something a little different.

Over the last couple of years I've been working with some different platforms and technologies; AWS, GCP, PostgreSQL, Python and a whole bunch of other stuff. It's not just me of course, this is how the technical landscape is these days so if anything it was very natural I suppose to diversify my technical knowledge and skill set. But when it comes to really getting into a technology I'm shifting focus specifically towards some of those areas and reigniting what I once had, and it's been a while since I said this, but I can't wait to see how things develop :-)

Tuesday, May 30, 2023

SQL Server 2022

It's been almost a year since I blogged about anything so it's probably about time that I blew the dust off the keyboard and got back into action!

In the tech world a lot has been happening whilst I've been away and in the world of SQL Server the biggest "thing" has certainly been the release of the very latest version, SQL Server 2022. So let me get straight to the point, in my very humble opinion, it's also the most underwhelming release of SQL Server that I've ever known.

If I think about my favourite version (because we've all got one...don't we?!!?) then SQL Server 2016 gets my vote, that and 2005. In between those two we've had other releases and new features etc but those definitely stick out and let's be real, SP1 for 2016 meant those standard users amongst us got to use some fun stuff at long last without forking out for Enterprise licences!

For more recent releases starting with SQL Server 2017 we had the whole Microsoft loves Linux fanfare and being the "platform of choice". Yes there were other things too like machine learning, adaptive query processing (which I do like the idea of) and, er, resumable online index rebuilds but this release was all about the idea of the platform. 

And then 2019 came along with the vision for the Modern Data Platform and using the word BIG at every opportunity. Speaking of which Microsoft also bet BIG on Kubernetes with the introduction of the aptly named Big Data Clusters, which were subsequently canned, so much for quenching our thirst:

"Our mission is to accelerate, delight, and empower our users as they quench their thirst for data driven insights. The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025."

At least TDE came to standard edition at long last!

Anyway, it seems to me that 2022 was a play-it-safe kind of release. There are no super bold moves, nothing to get really thirsty about anyway, and whilst I have to admit I did roll my eyes at the whole "most Azure enabled release" tagline, there are some handy features in there like the Managed Instance link. Sadly it isn't ready yet for bi-directional replication (and with that, Disaster Recovery capabilities) and it's still stuck in Public Preview!

Was it a rushed release? Undoubtedly, and unfortunately there are problems further down the Cumulative Update conga line as well, which Brent has recently blogged about: Breaking News: SQL Server 2022 Keeps Getting Worse. - Brent Ozar Unlimited®

Urgh.

Now I should point something out. When I said SQL Server 2022 is an underwhelming release, that isn't necessarily a bad thing, particularly when you look at the whole big data clusters [bad] idea, and I for one will certainly take an underwhelming release any day of the week as long as it works. But I can't help but agree with Brent's article title that as of right now, it's a release that is seemingly getting worse, and that is what really disappoints.

But let's not end on a sour note because I do like SQL Server 2022, or at least if I put aside current issues, the idea of it. 

Cynical eye rolls aside, I actually like the approach of SQL Server being more Azure enabled. Although it has limitations, the Synapse link looks interesting; to be honest I'm not a Synapse user so can't comment too much, but I did chuckle that enabling Azure Synapse Link and the link to Managed Instance isn't supported on the same database (although I can see why that's the case).

To me it's like Microsoft are going back to building on the strengths of SQL Server and leveraging some of the benefits of Azure at the same time and that's not a bad thing, at least, when it's all ready that will hopefully be the case.

Monday, July 11, 2022

Conga Lines: Some Misconceptions about SQL Server Updates.

There ain't no party like a SQL update party...


For SQL Server 2017 and above Microsoft stopped releasing Service Pack updates to the product and instead moved to a new servicing model; more information about that here.

Microsoft releases updates to SQL Server in two forms: Cumulative Updates (CU) and General Distribution Releases (GDR). The key difference is that Cumulative Updates contain all the goodies you could ever wish for (new features, improvements to existing ones, bug fixes and security patches) whilst General Distribution Releases contain just the "urgent fixes".

This is Misconception #1, because this type of update isn't restricted to security patches only. If Microsoft need to urgently fix something (like that could ever happen!) then it can, and probably will, be included in a GDR as well; although most of the time it is security related, technically it's not limited to that.

Misconception #2: I should always install Cumulative Updates. This is only true if you want to take advantage of all the features and functionality that are included in them. Some organisations will make the decision to only install the GDRs on their SQL Servers, which is what is known as being on the GDR train, branch or conga line (as opposed to the CU train etc).
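If you're not sure which line a given instance is on, the SERVERPROPERTY function can tell you; here's a rough sketch (property availability varies by version, so treat this as a starting point rather than gospel):

```sql
-- Which servicing line is this instance on?
-- ProductUpdateLevel returns e.g. 'CU18' for instances on the CU line
-- ProductBuildType returns 'GDR', 'OD' (on-demand, i.e. CU) or 'None'
SELECT  SERVERPROPERTY('ProductVersion')     AS BuildNumber,
        SERVERPROPERTY('ProductLevel')       AS ProductLevel,
        SERVERPROPERTY('ProductUpdateLevel') AS UpdateLevel,
        SERVERPROPERTY('ProductBuildType')   AS BuildType;
```

Comparing BuildNumber against Microsoft's published build lists is the surest way to confirm which conga line you've joined.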

Why would you not install Cumulative Updates? Well, let's be honest, updates to SQL Server occasionally break stuff (and sometimes they really break stuff) and I don't know of a single DBA who enjoys applying updates to SQL Server; if you do, I really want to know why! Being on the GDR conga line means you have fewer updates to apply over a product lifecycle, and as they're not breaking/fixing as much stuff, in theory there's less chance of anything going spectacularly wrong.

Misconception #4: GDRs aren't cumulative. Although they don't contain the word cumulative, General Distribution Releases are cumulative as well. This means when you install a GDR (and I know I keep flicking between acronym and words, but I'm lazy today and not in the mood for the grammar squad), where were we?? Ah yeah, when you install a GDR you also get all the fixes from previous GDRs as well; there's no need to install them one by one. It's exactly the same for the more precisely named Cumulative Updates.

Misconception #5: GDRs don't require testing. Honestly, I have heard this several times before and it's simply not true. Just because there's (in theory) less chance of something blowing up, there's still a chance nonetheless. Cumulative Updates might be a little different because there could be enhancements you want to test specifically, sure, but any update to SQL Server, whatever the type, needs full and thorough testing.

Misconception #6: GDRs only contain urgent fixes. Now I know I told you this earlier on in this very post, BUT there's also a different kind of GDR out there: the GDR-CU.

WTH is a GDR-CU?

OK, remember the conga lines? We have one line where only the GDRs are being installed and the other line that is installing all the goodies via Cumulative Updates (oooh, I should mention, you can only be in one of the conga lines, because being in two is way too much fun for anyone to handle).

If you're on the CU line and you install a GDR it doesn't just install the urgent fixes. It's like a CU in disguise: you'll also get any of the cumulative updates, security fixes and everything else you've missed up to that point too, hence the term GDR-CU.

If you're just installing GDRs then in this case they'll only contain the mega important stuff, and a GDR is still just a GDR.

Misconception #7: I can jump between conga lines. Well, in truth you can, once, and only if you're going from the ultra cautious GDR line to the all-singing, all-dancing CU one. Once you go CU, you can't go...well, you just can't change your mind and go back to the GDR one. Here's why:

Microsoft publish a build list for SQL Server 2019 which is rather helpfully split into two parts: the Cumulative Update train (conga line) and the GDR one. Take note of the version numbers for CVE-2022-29143, which at the time of writing is the very latest update for SQL 2019 and appears in both parts; for the CU party it's 15.0.4236.7 whilst over on the other side it's 15.0.2095.3. Once you install a CU...or GDR-CU, you're on that line. You literally can't install an update from the GDR path any more; it's a lower version number and those days are well and truly gone...unless you start removing stuff (and getting updates on is difficult enough).

[Sort of] Misconception #8: Automatic Updates make everything much easier. Up until now we've covered the two distinct conga lines, GDR and CU. But how do you join up? Well, the update is your ticket; if you only install GDRs you're part of the GDR party, and if you install a CU you're in that one. Automatic updates and their detection logic will mainly follow the same rules: if you have a CU installed you'll get the relevant updates for that path, and it's the same with GDRs - if you ONLY have a GDR path update installed, Windows won't put you on the CU train just for the hell of it.

But what if you have a baseline version, that is, one with no updates applied? This came up in a conversation the other day where a baseline install of SQL Server 2019 had automatically updated to the latest GDR but on the CU path (the GDR-CU), and after some digging around in the Microsoft Update detection logic I found this:

  • WSUS / Catalog channel that uses the new detection logic that allows the CU branch version of the security release to be applied against a clean baseline instance. This is the channel of the release that will be presented to users who go to the Microsoft Update Catalog website.

It still has to be allowed, but there's still the potential for a baseline build of SQL Server to be put firmly on the CU conga line, so it's something to be aware of if your organisation prefers to be GDR release only.

Anyway, that's enough for this post, I've got a conga line to join :-)

Tuesday, May 10, 2022

TSQL Tuesday #150 Invite – Your first technical job

 

This month's T-SQL Tuesday is hosted by Kenneth Fisher and is all about our first technical job. You can click the image above for the invite details and find out more about the monthly blog party!

I suppose I have very mixed feelings about my first technical job. Just before I moved into the role I was working shifts doing industrial printing; as much as I loved it, I was desperate to move into IT. Just by chance a manager remembered that my CV contained some college courses in Computer Studies and that I had some experience of Visual Basic, and after a lunchtime interview one day I moved into a hybrid developer/support role.

The main focus of the role was implementing data processing solutions. Data was captured from two source systems, extracted to floppy disk and imported into a large database (MS Access, yes you read that right) before being output back again to files on floppy disk and transferred via mainframe jobs to their intended recipients. 

To say the process was prone to error is an understatement. Things blew up constantly and as a result we were often making fixes on the fly (the kind of fixes that were powered by hopes and prayers more than anything else). As a result the days would often be long for everyone, days off were often interrupted by phone calls and it's no surprise that stress was all too commonplace. 

We were also extremely underpaid for the work we were doing. In fact the final nail in the coffin, as it were, came when a manager informed me there was good and bad news after my annual appraisal. The good news was that I'd received a bonus (woohoo!) but the bad news was I'd also got a pay rise (come again?). The pay rise was, well, to be frank I took it as an insult, and the manager looked at me and told me that I should be aware of what I needed to do. Shortly after, leave I did.

It would be very easy to write a lengthy post on how things should have been done differently, but that's experience for you and if I'm brutally honest I didn't learn some of the lessons until years later. But I started this post by saying I had very mixed feelings because despite all of the chaos I genuinely have some very fond memories of my time there.

I worked with some amazing people, some of whom I remain very close friends with today, and whenever we talk we always seem to end up in fits of laughter over some of the, shall we say, shenanigans we got up to! In fact I'd go as far as to say that unless I'm prompted to write a blog post about the experience and need to give it some serious thought, I don't even think about the negatives.

So after all that, I'm not sure what the message is, or even if there's supposed to be one. It was a very mixed bag of fun and games (depending on your definition of fun) and ultimately I learnt a lot from it, even if I didn't realise it at the time. The funny thing is, if you'd told me then that my future career was with SQL Server and not Visual Basic and Access, I'd have laughed at you. Maybe it goes to show that it doesn't matter how you get from A to B as long as you get there; alternatively, it might just mean you never know where you're going to end up!

Tuesday, March 8, 2022

More things you shouldn't do in SQL Server: Disabling Optimisation Phases

PLEASE DON'T DO THIS

In my last post I covered optimisation phases and how we can view optimiser information by using Trace Flag 8675. That post looked at the search(0) and search(1) phases, and we saw how the optimiser improved the cost of the query by using both of the phases, with the search(1) phase resulting in the better plan in terms of cost.

I mentioned in that post that the optimiser has another phase it can use, no surprise it's called search(2) also known as the full optimisation phase (yes that's the full optimisation phase of the full optimisation stage, still with me? Good!).

The full optimisation phase, I suppose, would normally be reserved for those beastly queries that the optimiser really has to get to work on (looking at you, Entity Framework), but we can force the optimiser's behaviour by using an undocumented trace flag, 8677, which entirely skips the search(1) phase.

Now it has to be said it's undocumented for a reason, and the reason is that it's really not a good idea to do this. In fact enabling this trace flag is such a bad idea that it will probably cause no end of issues with query performance...so let's do it! But before we do, let me say yet again: please don't do this! Disabling optimisation features is a really bad idea, just like we did in this post; the purpose of this demo is just to show that we can, and how dangerous it can get.

Right, on with the show...

DBCC TRACEON(3604, -1);

DBCC TRACEON(8675, -1); 

DBCC TRACEON(8677, -1); 

As in the last post I've enabled trace flags 3604 and 8675, and this time I've also enabled trace flag 8677. I'm going to run the following query; it won't be able to use the search(0) phase because it's only using two tables, and search(0) requires at least three tables in the query:

SELECT * FROM Person.Person per
INNER JOIN Sales.Customer cus ON cus.PersonID = per.BusinessEntityID
OPTION (RECOMPILE);

The last time we ran the query (or we can look at the estimated execution plan) the output was the following:

End of simplification, time: 0.002 net: 0.002 total: 0 net: 0.002

end exploration, tasks: 57 no total cost time: 0.001 net: 0.001 total: 0 net: 0.004

end search(1),  cost: 3.99696 tasks: 211 time: 0.002 net: 0.002 total: 0 net: 0.006

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.006

End of query plan compilation, time: 0 net: 0 total: 0 net: 0.007

This time though because we've disabled the search(1) phase we get this instead:

End of simplification, time: 0 net: 0 total: 0 net: 0

end exploration, tasks: 57 no total cost time: 0 net: 0 total: 0 net: 0.001

end search(2),  cost: 3.99696 tasks: 211 time: 0.001 net: 0.001 total: 0 net: 0.002

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.002

End of query plan compilation, time: 0 net: 0 total: 0 net: 0.003

We can see the trace flag has worked: search(1) has been skipped and instead it has used the search(2) optimisation phase, and apart from a very slight difference in compilation time the result is exactly the same, which is no surprise as the search(2) phase contains all the transformation rules that can be used by the optimiser.

Let's try the query that is using three tables: 

SELECT * FROM Person.Person per
INNER JOIN Sales.Customer cus ON cus.PersonID = per.BusinessEntityID
INNER JOIN Sales.SalesOrderHeader soh ON soh.CustomerID = cus.CustomerID 
OPTION (RECOMPILE);

End of simplification, time: 0 net: 0 total: 0 net: 0

end exploration, tasks: 71 no total cost time: 0.001 net: 0.001 total: 0 net: 0.001

end search(0),  cost: 14.5642 tasks: 386 time: 0.001 net: 0.001 total: 0 net: 0.003

end exploration, tasks: 582 Cost = 14.5642 time: 0 net: 0 total: 0 net: 0.003

end search(2),  cost: 6.57863 tasks: 1059 time: 0.002 net: 0.002 total: 0 net: 0.006

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.006

End of query plan compilation, time: 0 net: 0 total: 0 net: 0.007

Again we see the same as the previous post: the optimiser has used two phases, and we can see the cost improvement gained between search(0) and, this time, the search(2) phase. But at this point I'm wondering: if we can skip search(1), can we disable the others as well?

Yes we can (insert evil laugh here).

Trace flag 8677 isn't the only one of its kind; there's also 8670 to skip search(2) and 8750 to skip search(0). Again, don't do this...ever!! Let's start with 8670:

DBCC TRACEON(8670, -1); 

As expected (because SQL won't even give me a warning, probably because it thinks nobody in their right mind would do this), search(2) has been skipped, which has resulted in a query with a higher cost than if it could have used either search(1) or search(2):

End of simplification, time: 0.001 net: 0.001 total: 0 net: 0.001

end exploration, tasks: 71 no total cost time: 0.001 net: 0.001 total: 0 net: 0.002

end search(0),  cost: 14.5642 tasks: 386 time: 0.001 net: 0.001 total: 0 net: 0.003

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.004

End of query plan compilation, time: 0.001 net: 0.001 total: 0 net: 0.006

So what happens if we disable search(0), I can hear you asking??

DBCC TRACEON(8750, -1); 

And when I try to run the query again:

Msg 8624, Level 16, State 1, Line 8
Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.

Well, it seems we've broken it (you'll notice I've used "we've" there) - query optimisation has essentially been disabled; as we say in Yorkshire, it's well and truly knackered.

If anything it's shown how easy it can be to enable a couple of undocumented things and cause no end of problems, and a SQL Server that isn't working any more. Interestingly enough, even if I remove the OPTION (RECOMPILE), which should retrieve the query from cache, I still get the error. In order to set things right in the universe I'll quickly disable those trace flags again:

DBCC TRACEOFF(8677, -1); 
DBCC TRACEOFF(8670, -1); 
DBCC TRACEOFF(8750, -1);
DBCC TRACEOFF(8675, -1); 
DBCC TRACEOFF(3604, -1);

Watching Optimisation Phases with Trace Flag 8675

When a query is executed a few things happen: the query is parsed, a logical tree of the query is created, the query is simplified and then the optimiser determines if a trivial plan can be used. If a trivial plan can't be used, essentially if the optimiser has to figure out the optimal execution plan (such as making decisions on joins), then the query is passed to the full optimisation stage.

The full optimisation stage is where the optimiser uses a bag of tricks to optimise our query (surprise, surprise); well, technically it has three bags of tricks, named optimisation phases, that each contain a collection of transformation rules (which I cover in this post on things you should never do). The optimiser is not limited to using just one of the phases, and each has set criteria which determine if the optimiser can use that particular phase.
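As an aside, if you want a broader view than per-query trace output, the sys.dm_exec_query_optimizer_info DMV exposes instance-wide counters showing how often each optimisation path has been taken since startup; a rough sketch (the exact counter list varies between versions):

```sql
-- Instance-wide optimiser counters since the last restart.
-- 'trivial plan', 'search 0', 'search 1' and 'search 2' show how often
-- each optimisation path has been used; occurrence is the count and
-- value is a per-occurrence average where applicable.
SELECT counter, occurrence, value
FROM   sys.dm_exec_query_optimizer_info
WHERE  counter IN ('optimizations', 'trivial plan',
                   'search 0', 'search 1', 'search 2');
```

It won't tell you which phase a specific query used, but it's a handy sanity check on how hard the optimiser is working overall.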

In order to see how the optimiser is using these phases we need to enable Trace Flag 8675, as well as Trace Flag 3604 which will redirect the output to the query messages tab in Management Studio:

DBCC TRACEON(8675, -1); 

DBCC TRACEON(3604, -1);

Let's start with a very straightforward query on the AdventureWorks2019 sample database. For each of the queries we can either actually run them or show the estimated execution plan, which will still use the query optimisation process:

SELECT * FROM Person.Person OPTION (RECOMPILE);

When I check the output in the messages tab I can see the following:

End of simplification, time: 0 net: 0 total: 0 net: 0

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.001

End of query plan compilation, time: 0 net: 0 total: 0 net: 0.001

For this query we don't see any information relating to optimisation phases, that's expected because for this query the optimiser has used a trivial plan, which we can check in the plan properties:


If we re-run the query excluding the OPTION (RECOMPILE) and again check the messages tab, we don't see anything this time. This is because the query has been retrieved from the plan cache, or procedure cache as it is also known.
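We can confirm the plan really is being served from cache by watching the usecounts column in sys.dm_exec_cached_plans climb on each re-run; as a rough sketch (the LIKE filter on the query text is just an illustrative way of finding our statement):

```sql
-- Find our query in the plan cache; usecounts should increase each time
-- the cached plan is reused.
SELECT cp.usecounts,
       cp.objtype,
       st.text
FROM   sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE  st.text LIKE '%FROM Person.Person%'
AND    st.text NOT LIKE '%dm_exec_cached_plans%'; -- exclude this query itself
```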

Now I'll add a join to the query to give the optimiser something to think about and use the RECOMPILE option to ensure the query is not retrieved from cache:

SELECT * FROM Person.Person per
INNER JOIN Sales.Customer cus ON cus.PersonID = per.BusinessEntityID
OPTION (RECOMPILE);

This time in the messages output we can see a line for search(1), which is also known as the Quick Plan optimisation phase:

End of simplification, time: 0.002 net: 0.002 total: 0 net: 0.002

end exploration, tasks: 57 no total cost time: 0.001 net: 0.001 total: 0 net: 0.004

end search(1),  cost: 3.99696 tasks: 211 time: 0.002 net: 0.002 total: 0 net: 0.006

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.006

End of query plan compilation, time: 0 net: 0 total: 0 net: 0.007

Here we can see that the optimiser has produced an execution plan with a cost of 3.99696, and I can check that against either the estimated or actual plan in Management Studio by hovering over the left-most operator (in this case the Select) and checking the subtree cost; the image below is from the estimated plan showing the estimated cost to be 3.99696.


In order to demonstrate the optimiser using multiple phases I'll add a join to another table. There is another optimisation phase, search(0), named Transaction Processing, where the transformation rules are geared towards OLTP-type queries. The optimiser didn't start with search(0) for our previous query because there have to be at least three tables used in the query in order for the optimiser to use this particular phase.

SELECT * FROM Person.Person per
INNER JOIN Sales.Customer cus ON cus.PersonID = per.BusinessEntityID
INNER JOIN Sales.SalesOrderHeader soh ON soh.CustomerID = cus.CustomerID 
OPTION (RECOMPILE);

Again using the RECOMPILE option, this time we're using three tables in the query (Person.Person, Sales.Customer and Sales.SalesOrderHeader), and if we look at the message output after the query has executed we can see the use of two optimisation phases, search(0) and search(1), along with the corresponding time that the optimiser has spent in each phase:

End of simplification, time: 0.004 net: 0.004 total: 0 net: 0.004

end exploration, tasks: 71 no total cost time: 0.005 net: 0.005 total: 0 net: 0.01

end search(0),  cost: 14.5642 tasks: 386 time: 0.008 net: 0.008 total: 0 net: 0.018

end exploration, tasks: 572 Cost = 14.5642 time: 0.002 net: 0.002 total: 0 net: 0.021

end search(1),  cost: 6.57863 tasks: 1027 time: 0.01 net: 0.01 total: 0 net: 0.031

End of post optimization rewrite, time: 0 net: 0 total: 0 net: 0.032

End of query plan compilation, time: 0.002 net: 0.002 total: 0 net: 0.035

The interesting thing here is the cost difference between the optimisation phases: the search(0) phase returned a query plan with a cost of 14.5642, however by using the search(1) phase the query has used a plan with an associated cost of 6.57863, which is a clear improvement.

There is another phase, search(2), which is the full optimisation phase that contains every single transformation rule available. I'll cover that in a forthcoming post; in the meantime I'll write a horrible enough query to use that phase, and we'll look into optimiser timeouts as well.

Trace Flag 8675 is one of those little-known trace flags (IMO) that gives us some in-depth information on how the query optimiser is working.

To finish off, to disable the trace flags I should now run the following: 

DBCC TRACEOFF(8675, -1); 

DBCC TRACEOFF(3604, -1);
