The Definitive Guide to Azure Data Engineering: Modern ELT, DevOps, and Analytics on the Azure Cloud Platform
J**E
What a stinker
This book simply hasn't had any kind of technical review, despite a nod to some bozo called Greg Low at the start. If you want to get paid for doing nothing, speak to Greg about becoming a technical reviewer for Apress.

Let me start by saying I have a 25-year career in software and database development, and over those years I have read dozens and dozens of technical books, along with BOL and all the other technical support sites you'd expect. I'm not green, and I'm not troubled when certain chapters/topics are left as an 'exercise for the reader'. It's worth saying that for background.

The first 3 chapters are background on Azure data warehousing services and are informative enough, but nothing you couldn't determine from reading Microsoft's Azure pages.

Chapter 4 is where the fun begins. The objective of this chapter is to load ADLS Gen2 from a SQL database. Straightforward enough. It's here that you realise there's no code download for the book. Slightly annoying, but at this point, no biggie: we're only talking about a bit of basic DDL to create a table and some Azure expressions to set folder locations. This becomes a lot more annoying later on, when there are some fairly lengthy scripts that need typing out by hand. Why the hell couldn't you have provided these in a digital format to copy/paste, and while you're at it, sample databases and files? Are you psychopaths??? What would have been helpful at this stage is a more detailed explanation of the expressions being used (the difference between dataset() and item(), etc.). The expression on page 68 also contains an error: they forget to add the .parquet file extension to the file path on the sink dataset. Something that will come back and bite in the following chapter.

Chapter 5 is an exploration of using COPY INTO to move data from ADLS Gen2 to a dedicated SQL pool. There are a couple of issues here.
On page 87 the COPY INTO script will fail if you followed the instructions in the previous chapter, as the files you've loaded into your lake are missing the .parquet extension, so the script can't find any files that match the *.parquet pattern. OK, after a bit of head scratching, an amendment to the previous pipeline, and a reload later, we have files with the right extension. The next issue with the code is with the FILE_FORMAT and CREDENTIAL properties of the command. The FILE_FORMAT is defined as snappyparquet, but so far we haven't defined what snappyparquet is. What we're missing, I think, is:

    CREATE EXTERNAL FILE FORMAT snappyparquet
    WITH ( FORMAT_TYPE = PARQUET, DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec' );

This needs to run before the COPY INTO script. Again, annoying, but not the end of the world. The CREDENTIAL is set to use 'Managed Identity'. A brief discussion at this point of the different options here would have been useful. In the end I got this working by using:

    CREDENTIAL=(IDENTITY= 'Shared Access Signature', SECRET='SAS TOKEN')

I didn't try the CSV option; I was too brassed off.

Chapter 6 is an exploration of loading data directly from ADLS Gen2 to a dedicated SQL pool. It starts by totally redefining a pipeline parameter table we used in Chapter 4, but with no explanation of what any of the columns mean or how it's meant to be used/populated. We then dive straight into using the table without any explanation of how! We have defined datasets with totally different ADLS paths to any of the previous ones used, using expressions like @{item().src_schema}/@{item().dst_name}!!?? There should be a reasonable explanation of how this table is going to be used and how to populate it, so that we can get the samples running. I haven't been able to complete this chapter as there's too much missing information.

That's as far as I've got so far. This is by far and away the worst technical resource I've ever had the misfortune to read.
It feels like it's been rushed to market with absolutely zero technical review; otherwise they'd have realised the exercises are riddled with errors and therefore impossible to follow.

If I can limp through any more of this I'll provide chapter updates. Really disappointed. I've got plenty of books by Apress and they're normally rock solid, but this is truly atrocious.
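
For anyone hitting the same Chapter 5 problem, the two pieces described in the review above (defining the file format first, then authenticating with a SAS token) might fit together roughly as follows. This is only a sketch, not the book's script: the target table dbo.my_table, the storage URL, and the token value are hypothetical placeholders, and it assumes the lake files already carry the .parquet extension.

```sql
-- Define the file format object that the COPY INTO statement refers to by name.
-- (Definition taken from the review; run it once, before the load.)
CREATE EXTERNAL FILE FORMAT snappyparquet
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);

-- Load every .parquet file under the folder into the dedicated SQL pool table.
-- dbo.my_table, the URL, and the SAS token are placeholders (assumptions).
COPY INTO dbo.my_table
FROM 'https://<storage-account>.blob.core.windows.net/<container>/<folder>/*.parquet'
WITH (
    FILE_FORMAT = snappyparquet,
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<SAS token>')
);
```

Swapping the CREDENTIAL line for IDENTITY = 'Managed Identity' (with no SECRET) is the variant the book reportedly uses; that route only works if the workspace identity has been granted access to the storage account.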
Y**A
Really badly written book
I was so hesitant to add a review, as I do not like cutting into people's earnings in any shape or form. But by Chapter 5, it was sometimes taking me far too long to understand what the author is trying to say. I have read loads of technical books but never found one this challenging in this way. Also, please try to list the steps for configuring or creating something rather than expecting people to know the platform. By the way, the Azure platform interface and layout keep changing, which is why it is worth directing readers to what they should be clicking on, for example (1- Find xyz, 2- Click on Z, 3- Select M). I am at Chapter 5, and it should never have taken me 3 days to create a pipeline had the steps been clearly shown. I do not think this book can help me now. I will stick to videos. What a waste of money.