	{"id":65787,"date":"2022-01-18T09:50:59","date_gmt":"2022-01-18T09:50:59","guid":{"rendered":"https:\/\/www.artefact.com\/?post_type=news&#038;p=65787"},"modified":"2024-09-20T17:45:48","modified_gmt":"2024-09-20T16:45:48","slug":"how-to-quickly-compare-two-datasets-using-a-generic-and-powerful-sql-query","status":"publish","type":"blog","link":"https:\/\/www.artefact.com\/nl\/blog\/how-to-quickly-compare-two-datasets-using-a-generic-and-powerful-sql-query\/","title":{"rendered":"Hoe u snel twee datasets kunt vergelijken met behulp van een algemene &amp; krachtige SQL-query"},"content":{"rendered":"<p><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling article-author\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-background-color:#ffffff;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_2 1_2 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:50%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:50%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-title title fusion-title-1 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Auteur<\/h2><\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27150%27%20height%3D%270%27%20viewBox%3D%270%200%20150%200%27%3E%3Crect%20width%3D%27150%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/1560114886665.jpeg\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left article-author-image\" style=\"width: 150px; border-radius: 54% 46% 77% 23% \/ 74% 40% 60% 26%; overflow: hidden;\" width=\"150\" height=\"auto\" \/><div class=\"fusion-title title fusion-title-2 fusion-sep-none fusion-title-text fusion-title-size-three article-author-name-title\" style=\"--awb-margin-bottom-small:8px;\"><h3 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:20;line-height:1.2;\">K\u00e9vin Lev\u00e8vre<\/h3><\/div><div class=\"fusion-text fusion-text-1 article-author-description\"><p>Data Ontwikkelaar bij Artefact<\/p>\n<\/div><\/div><\/div><\/div><\/div><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-2 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-margin-top:40px;--awb-margin-bottom:40px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-center fusion-flex-justify-content-center fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-1 fusion_builder_column_1_1 1_1 fusion-flex-column fusion-flex-align-self-center fusion-column-inner-bg-wrapper\" style=\"--awb-padding-top:20px;--awb-padding-right:20px;--awb-padding-bottom:20px;--awb-padding-left:20px;--awb-overflow:hidden;--awb-inner-bg-size:cover;--awb-border-color:rgba(10,17,40,0.1);--awb-border-top:1px;--awb-border-right:1px;--awb-border-bottom:1px;--awb-border-left:1px;--awb-border-style:solid;--awb-border-radius:4px 4px 4px 4px;--awb-inner-bg-border-radius:4px 4px 4px 4px;--awb-inner-bg-overflow:hidden;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><span class=\"fusion-column-inner-bg hover-type-none\"><a class=\"fusion-column-anchor\" href=\"https:\/\/medium.com\/artefact-engineering-and-data-science\/scoring-customer-propensity-using-machine-learning-models-on-google-analytics-data-ba1126469c1f\" rel=\"noopener noreferrer\" target=\"_blank\"><span class=\"fusion-column-inner-bg-image\"><\/span><\/a><\/span><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-center fusion-content-layout-row fusion-flex-align-items-center\"><div class=\"fusion-text fusion-text-2\"><p><u>Lees ons artikel over<\/u><\/p>\n<\/div><div class=\"fusion-image-element\" style=\"--awb-margin-right:20px;--awb-margin-left:20px;--awb-max-width:150px;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);\"><span class=\"fusion-imageframe imageframe-none imageframe-1 hover-type-none\"><img decoding=\"async\" width=\"4000\" height=\"992\" title=\"Medium Blog\" src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog.png\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog.png\" alt class=\"lazyload img-responsive wp-image-60582\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%274000%27%20height%3D%27992%27%20viewBox%3D%270%200%204000%20992%27%3E%3Crect%20width%3D%274000%27%20height%3D%27992%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-srcset=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-200x50.png 200w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-400x99.png 400w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-600x149.png 600w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-800x198.png 800w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-1200x298.png 1200w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog.png 4000w\" data-sizes=\"auto\" data-orig-sizes=\"(max-width: 640px) 100vw, 4000px\" \/><\/span><\/div><div class=\"fusion-text fusion-text-3\"><p>.<\/p>\n<\/div><\/div><\/div><\/div><\/div><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-3 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-2 fusion_builder_column_1_1 1_1 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-text fusion-text-4 description\"><p>In hun dagelijkse werkzaamheden moeten Data Engineer's en Data Analisten data ingestion processen verbeteren. Naast de gebruikelijke unit tests, kan het interessant zijn om eenvoudig en snel twee datasets (d.w.z. tabellen, views, queries, etc) te vergelijken voor verschillende doeleinden, zoals impact analyse of non-regressie testen. Ook het identificeren van discrepanties tussen twee gezichtspunten van een snapshot-tabel is erg nuttig voor ad-hocanalyse of debuggen.<\/p>\n<p>Met dit in gedachten en voor SQL-beoefenaars is een kant-en-klaar en herbruikbaar script zinvol, vandaar het doel van dit artikel.<\/p>\n<\/div><\/div><\/div><\/div><\/div><article class=\"fusion-fullwidth fullwidth-box fusion-builder-row-4 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-justify-content-center fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-3 fusion_builder_column_1_1 1_1 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-title title fusion-title-3 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Gebruiksklare SQL-sjabloon uitgelegd<\/h2><\/div><div class=\"fusion-text fusion-text-5\"><p>Laten we naar de kern van de zaak gaan met de verwachte vraag:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27619%27%20height%3D%270%27%20viewBox%3D%270%200%20619%200%27%3E%3Crect%20width%3D%27619%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-00-45-1024x789.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 619px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"619\" height=\"auto\" \/><div class=\"fusion-text fusion-text-6\"><p>Laten we het nu uitleggen:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-1 fusion-checklist-default type-icons paddingList dark-text\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Eerst gebruiken we CTE - Common Table Expressions (WITH ... ASstatements) om de twee te vergelijken datasets te identificeren. Hierdoor kunnen de volgende queryregels ongewijzigd blijven.<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Vervolgens construeren we twee vergelijkbare queries met de alias dataset_1_only en dataset_2_only om rijen te behouden die alleen aanwezig zijn in dataset 1 (vergeleken met dataset 2) en vice versa dankzij de EXCEPT set operator.<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Tot slot wordt er een aanvullend vlagveld gedefinieerd om de bron van niet overeenkomende rijen uit beide query's te identificeren. Deze worden dan verzameld via de UNION ALL set operator.<\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-7\"><p><em>Opmerking: uiteraard is het hebben van twee datasets met vergelijkbare velden een voorwaarde om de bovenstaande query te kunnen gebruiken.<\/em><\/p>\n<\/div><div class=\"fusion-title title fusion-title-4 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Ge\u00efllustreerd voorbeeld<\/h2><\/div><div class=\"fusion-text fusion-text-8\"><p>Genoeg theorie, nu de praktijk! Stel dat een supermarkt <strong>alle producten ophalen<\/strong> die in aanmerking komen voor een promotie in een enkele tabel volgens de week <strong>promotievoorwaarden<\/strong>.<\/p>\n<p>In eerste instantie beschouwen we onderstaande producten als op voorraad:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27587%27%20height%3D%270%27%20viewBox%3D%270%200%20587%200%27%3E%3Crect%20width%3D%27587%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-12-30-1024x326.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 587px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"587\" height=\"auto\" \/><div class=\"fusion-text fusion-text-9\"><p>Volgens de promotievoorwaarden in week 1:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-2 fusion-checklist-default type-icons paddingList dark-text\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>10% korting op fruit, groenten en kruiden schappen<\/em><\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-10\"><p>de dataset_1 is:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27598%27%20height%3D%270%27%20viewBox%3D%270%200%20598%200%27%3E%3Crect%20width%3D%27598%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-15-18-1024x304.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 598px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"598\" height=\"auto\" \/><div class=\"fusion-text fusion-text-11\"><p>In week 2 evolueren de promotievoorwaarden naar:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-3 fusion-checklist-default type-icons paddingList dark-text\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>10% korting op fruit en ijsjes schappen<\/em><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>20% korting op groenten schap<\/em><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>geen korting op kruidenrek<\/em><\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-12\"><p>De evolutie van de data pijplijn genereert de dataset_2:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27598%27%20height%3D%270%27%20viewBox%3D%270%200%20598%200%27%3E%3Crect%20width%3D%27598%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-19-04-1024x302.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 598px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"598\" height=\"auto\" \/><div class=\"fusion-text fusion-text-13\"><p>Maar... zijn we echt zeker van de resultaten? Typische vragen zijn:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-4 fusion-checklist-default type-icons\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Zijn er ontbrekende of extra rijen?<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Zijn ge\u00efmplementeerde evoluties correct op be\u00efnvloede velden?<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Is er enige regressie op niet-be\u00efnvloede velden?<\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-14\"><p><em>Opmerking: ook al lijkt het gemakkelijk om de bovenstaande vragen intu\u00eftief te beantwoorden dankzij een triviaal voorbeeld op een kleine dataset, in echte gebruiksgevallen hebben we meestal te maken met veel rijen en kolommen van complexe queries (transformaties, joins, aggregaten, windowsfuncties, ...) waardoor deze query zijn volledige betekenis krijgt.<\/em><\/p>\n<\/div><div class=\"fusion-text fusion-text-15\"><p>Laten we proberen de 3 vragen te beantwoorden door de resultaten van de vergelijkende query te interpreteren:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27619%27%20height%3D%270%27%20viewBox%3D%270%200%20619%200%27%3E%3Crect%20width%3D%27619%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-24-02-1024x336.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 619px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"619\" height=\"auto\" \/><div class=\"fusion-text fusion-text-16\"><p>Opmerkingen:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-5 fusion-checklist-default type-icons\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Geen rij met producten uit het fruitschap: <strong>fijn<\/strong> als de \u201c<em>10% korting op fruit<\/em>\u201d is ongewijzigd gebleven van week 1 naar week 2<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Product_ID #50 (kruidenrek) ontbreekt in dataset_2 (alleen in dataset_1): <strong>fijn<\/strong> als de \u201c<em>korting op kruiden<\/em>\u201d werd gestopt in week 2<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Product_ID #60 (ijsplank) ontbreekt in dataset_1 (alleen in dataset_2): <strong>fijn<\/strong> als de \u201c<em>korting op ijsjes<\/em>\u201d werd ge\u00efntroduceerd in week 2<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Rijen met Product_ID #30 en #40 (groenteschap) wijken af op het veld gereduceerde_prijs: <strong>fijn<\/strong> als de \u201c<em>korting op groenten<\/em>\u201d steeg van 10% in week 1 naar 20% in week 2<\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-17\"><p>Eindelijk ziet alles er goed uit:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-6 fusion-checklist-default type-icons\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Zijn er ontbrekende of extra rijen?<strong> Nee!<\/strong><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Zijn ge\u00efmplementeerde evoluties correct op be\u00efnvloede velden? <strong>Ja!<\/strong><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Is er enige regressie op niet-be\u00efnvloede velden? <strong>Nee!<\/strong><\/p>\n<\/div><\/li><\/ul><div class=\"fusion-title title fusion-title-5 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Tips en trucs<\/h2><\/div><div class=\"fusion-text fusion-text-18\"><p>In CTE kan het, zelfs als u uw ingestion SQL-query voor\/na evolutie kunt kopi\u00ebren-plakken, een goed idee zijn om <strong>de resultaten opslaan in tijdelijke tabellen<\/strong> om de vergelijking te vereenvoudigen, de queryprestaties te verbeteren en te profiteren van caching als u de query meerdere keren uitvoert.<\/p>\n<p>In de echte wereld kunnen verschillen tussen twee datasets rommelig zijn. Dan, <strong>resultaten sorteren<\/strong> op sleutel (de granulariteit van de datasets) en vlag kan enorm helpen bij het interpreteren en vergelijken van gelijkwaardige rijen afkomstig van beide datasets.<\/p>\n<p>Om het onderzoek te vergemakkelijken en de oorsprong van de kloof te identificeren, zijn de <strong>verdacht datasets<\/strong> <strong>algemene velden<\/strong> kan verwijderd worden (d.w.z. becommentarieerd in CTE) uit de vergelijking: als er geen resultaat is, betekent dit dat alle vergeleken velden gelijk zijn. U kunt zich dan bij de volgende vergelijking alleen richten op de resterende verdachte velden en deze stap voor stap uitvoeren.<\/p>\n<\/div><div class=\"fusion-text fusion-text-19\"><p><em>Opmerking: een gelijkwaardige analyse had kunnen worden uitgevoerd met behulp van de LEFT JOIN-strategie, maar zou veel moeilijker te onderhouden zijn geweest (beheer van NULLs &amp; veldenvergelijkingen) en minder effici\u00ebnt omdat setoperators krachtiger zijn dan joins.<\/em><\/p>\n<\/div><div class=\"fusion-title title fusion-title-6 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Samenvatting<\/h2><\/div><div class=\"fusion-text fusion-text-20\"><p>Deze gesjabloneerde query maakt het daarom eenvoudiger voor ontwikkelaars om snel wijzigingen op complexe data-pijplijnen te valideren. Het is nuttig als aanvulling op meer traditionele unit tests, en kan zelfs op een meer algemene manier gebruikt worden bij het vergelijken van twee datasets met vergelijkbare structuren. En last but not least, de logica van deze query is zelfs gemakkelijk uit het hoofd te leren!<\/p>\n<\/div><div class=\"fusion-text fusion-text-21\"><p>Bedankt voor het lezen, ik hoop dat het duidelijk was en ik hoor graag uw feedback :)<\/p>\n<\/div><\/div><\/div><\/div><\/article><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-5 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-margin-top:40px;--awb-margin-bottom:40px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-center fusion-flex-justify-content-center fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-4 fusion_builder_column_1_1 1_1 fusion-flex-column fusion-flex-align-self-center\" style=\"--awb-padding-top:40px;--awb-padding-right:40px;--awb-padding-bottom:40px;--awb-padding-left:40px;--awb-overflow:hidden;--awb-bg-position:left center;--awb-bg-size:cover;--awb-border-color:rgba(10,17,40,0.1);--awb-border-style:solid;--awb-border-radius:4px 4px 4px 4px;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper lazyload fusion-column-has-shadow fusion-flex-justify-content-center fusion-content-layout-column fusion-column-has-bg-image\" data-bg-url=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/03\/background.jpg\" data-bg=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/03\/background.jpg\"><div class=\"fusion-image-element\" style=\"text-align:center;--awb-margin-right:20px;--awb-margin-left:20px;--awb-max-width:150px;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);\"><span class=\"fusion-imageframe imageframe-none imageframe-2 hover-type-none\"><img decoding=\"async\" width=\"72\" height=\"41\" title=\"middelgrote\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%2772%27%20height%3D%2741%27%20viewBox%3D%270%200%2072%2041%27%3E%3Crect%20width%3D%2772%27%20height%3D%2741%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/03\/medium.png\" alt class=\"lazyload img-responsive wp-image-60927\"\/><\/span><\/div><div class=\"fusion-title title fusion-title-7 fusion-sep-none fusion-title-center fusion-title-text fusion-title-size-three\" style=\"--awb-margin-top:20px;--awb-margin-bottom:0px;--awb-margin-bottom-small:8px;\"><h3 class=\"fusion-title-heading title-heading-center fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:20;line-height:1.2;\">Medium Blog bij Artefact.<\/h3><\/div><div class=\"fusion-text fusion-text-22\" style=\"--awb-content-alignment:center;\"><p>Dit artikel werd oorspronkelijk gepubliceerd op <strong>Medium.com<\/strong>.<br \/>\nVolg ons op ons medium Blog !<\/p>\n<\/div><div style=\"text-align:center;\"><a class=\"fusion-button button-flat button-medium button-default fusion-button-default button-1 fusion-button-default-span fusion-button-default-type\" target=\"_blank\" rel=\"noopener noreferrer\" href=\"https:\/\/medium.com\/artefact-engineering-and-data-science\/how-to-quickly-compare-two-datasets-using-a-generic-powerful-sql-query-7efe752332b7\"><span class=\"fusion-button-text awb-button__text awb-button__text--default\">Lees ons artikel<\/span><\/a><\/div><\/div><\/div><\/div><\/div><\/p>","protected":false},"excerpt":{"rendered":"<p>Een stapsgewijze handleiding om het vergelijken van datasets te vergemakkelijken via een gebruiksklaar sjabloon voor Structured Query Language<\/p>","protected":false},"featured_media":68686,"parent":0,"template":"","meta":{"_acf_changed":false,"ep_exclude_from_search":false},"blog-category":[21939],"blog-language":[2991],"class_list":["post-65787","blog","type-blog","status-publish","has-post-thumbnail","hentry","blog-category-medium","blog-language-en"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/blog\/65787","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/types\/blog"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/media\/68686"}],"wp:attachment":[{"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/media?parent=65787"}],"wp:term":[{"taxonomy":"blog-category","embeddable":true,"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/blog-category?post=65787"},{"taxonomy":"blog-language","embeddable":true,"href":"https:\/\/www.artefact.com\/nl\/wp-json\/wp\/v2\/blog-language?post=65787"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}