	{"id":65787,"date":"2022-01-18T09:50:59","date_gmt":"2022-01-18T09:50:59","guid":{"rendered":"https:\/\/www.artefact.com\/?post_type=news&#038;p=65787"},"modified":"2024-09-20T17:45:48","modified_gmt":"2024-09-20T16:45:48","slug":"how-to-quickly-compare-two-datasets-using-a-generic-and-powerful-sql-query","status":"publish","type":"blog","link":"https:\/\/www.artefact.com\/de\/blog\/how-to-quickly-compare-two-datasets-using-a-generic-and-powerful-sql-query\/","title":{"rendered":"So vergleichen Sie schnell zwei data-Sets mit einer generischen und leistungsstarken SQL-Abfrage"},"content":{"rendered":"<p><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling article-author\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-background-color:#ffffff;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_2 1_2 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:50%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:50%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-title title fusion-title-1 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Autor<\/h2><\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27150%27%20height%3D%270%27%20viewBox%3D%270%200%20150%200%27%3E%3Crect%20width%3D%27150%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/1560114886665.jpeg\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left article-author-image\" style=\"width: 150px; border-radius: 54% 46% 77% 23% \/ 74% 40% 60% 26%; overflow: hidden;\" width=\"150\" height=\"auto\" \/><div class=\"fusion-title title fusion-title-2 fusion-sep-none fusion-title-text fusion-title-size-three article-author-name-title\" style=\"--awb-margin-bottom-small:8px;\"><h3 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:20;line-height:1.2;\">K\u00e9vin Lef\u00e8vre<\/h3><\/div><div class=\"fusion-text fusion-text-1 article-author-description\"><p>Data Entwickler bei Artefact<\/p>\n<\/div><\/div><\/div><\/div><\/div><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-2 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-margin-top:40px;--awb-margin-bottom:40px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-center fusion-flex-justify-content-center fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-1 fusion_builder_column_1_1 1_1 fusion-flex-column fusion-flex-align-self-center fusion-column-inner-bg-wrapper\" style=\"--awb-padding-top:20px;--awb-padding-right:20px;--awb-padding-bottom:20px;--awb-padding-left:20px;--awb-overflow:hidden;--awb-inner-bg-size:cover;--awb-border-color:rgba(10,17,40,0.1);--awb-border-top:1px;--awb-border-right:1px;--awb-border-bottom:1px;--awb-border-left:1px;--awb-border-style:solid;--awb-border-radius:4px 4px 4px 4px;--awb-inner-bg-border-radius:4px 4px 4px 4px;--awb-inner-bg-overflow:hidden;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><span class=\"fusion-column-inner-bg hover-type-none\"><a class=\"fusion-column-anchor\" href=\"https:\/\/medium.com\/artefact-engineering-and-data-science\/scoring-customer-propensity-using-machine-learning-models-on-google-analytics-data-ba1126469c1f\" rel=\"noopener noreferrer\" target=\"_blank\"><span class=\"fusion-column-inner-bg-image\"><\/span><\/a><\/span><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-center fusion-content-layout-row fusion-flex-align-items-center\"><div class=\"fusion-text fusion-text-2\"><p><u>Lesen Sie unseren Artikel \u00fcber<\/u><\/p>\n<\/div><div class=\"fusion-image-element\" style=\"--awb-margin-right:20px;--awb-margin-left:20px;--awb-max-width:150px;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);\"><span class=\"fusion-imageframe imageframe-none imageframe-1 hover-type-none\"><img decoding=\"async\" width=\"4000\" height=\"992\" title=\"Mittel Blog\" src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog.png\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog.png\" alt class=\"lazyload img-responsive wp-image-60582\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%274000%27%20height%3D%27992%27%20viewBox%3D%270%200%204000%20992%27%3E%3Crect%20width%3D%274000%27%20height%3D%27992%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-srcset=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-200x50.png 200w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-400x99.png 400w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-600x149.png 600w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-800x198.png 800w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog-1200x298.png 1200w, https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/04\/Medium-Blog.png 4000w\" data-sizes=\"auto\" data-orig-sizes=\"(max-width: 640px) 100vw, 4000px\" \/><\/span><\/div><div class=\"fusion-text fusion-text-3\"><p>.<\/p>\n<\/div><\/div><\/div><\/div><\/div><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-3 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-2 fusion_builder_column_1_1 1_1 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-text fusion-text-4 description\"><p>Bei ihrer t\u00e4glichen Arbeit m\u00fcssen Data Engineers und Data-Analysten die data-Ingestionsprozesse verbessern. \u00dcber die \u00fcblichen Unit-Tests hinaus kann es interessant sein, zwei data-Sets (d.h. Tabellen, Views, Abfragen usw.) f\u00fcr verschiedene Zwecke wie Auswirkungsanalysen oder Nicht-Regressionstests einfach und schnell zu vergleichen. Auch die Identifizierung von Diskrepanzen zwischen zwei Standpunkten einer Snapshot-Tabelle ist f\u00fcr Ad-hoc-Analysen oder die Fehlersuche sehr n\u00fctzlich.<\/p>\n<p>Vor diesem Hintergrund und f\u00fcr SQL-Anwender ist ein vorgefertigtes und wiederverwendbares Skript sinnvoll, daher der Zweck dieses Artikels.<\/p>\n<\/div><\/div><\/div><\/div><\/div><article class=\"fusion-fullwidth fullwidth-box fusion-builder-row-4 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-flex-start fusion-flex-justify-content-center fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-3 fusion_builder_column_1_1 1_1 fusion-flex-column\" style=\"--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-title title fusion-title-3 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Gebrauchsfertige SQL-Vorlage erkl\u00e4rt<\/h2><\/div><div class=\"fusion-text fusion-text-5\"><p>Lassen Sie uns mit der erwarteten Anfrage zum Kern der Sache kommen:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27619%27%20height%3D%270%27%20viewBox%3D%270%200%20619%200%27%3E%3Crect%20width%3D%27619%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-00-45-1024x789.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 619px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"619\" height=\"auto\" \/><div class=\"fusion-text fusion-text-6\"><p>Lassen Sie es uns erkl\u00e4ren:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-1 fusion-checklist-default type-icons paddingList dark-text\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Zun\u00e4chst verwenden wir CTE - Common Table Expressions (WITH ... ASstatements), um die beiden zu vergleichenden datasets zu identifizieren. Dadurch k\u00f6nnen die folgenden Abfragezeilen unver\u00e4ndert bleiben.<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Dann konstruieren wir zwei \u00e4hnliche Abfragen mit den Aliasen dataset_1_only und dataset_2_only, um Zeilen zu behalten, die nur in dataset 1 vorhanden sind (im Vergleich zu dataset 2) und umgekehrt dank des EXCEPT-Set-Operators.<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Schlie\u00dflich wird ein Flag-Komplement\u00e4rfeld definiert, um die Quelle der nicht \u00fcbereinstimmenden Zeilen aus beiden Abfragen zu identifizieren. Diese werden dann \u00fcber den Operator UNION ALL gesammelt.<\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-7\"><p><em>Hinweis: Voraussetzung f\u00fcr die Verwendung der obigen Abfrage ist nat\u00fcrlich, dass Sie zwei data-Sets mit vergleichbaren Feldern haben.<\/em><\/p>\n<\/div><div class=\"fusion-title title fusion-title-4 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Illustriertes Beispiel<\/h2><\/div><div class=\"fusion-text fusion-text-8\"><p>Genug der Theorie, jetzt zur Praxis! Angenommen, ein Supermarkt m\u00f6chte <strong>alle Produkte abrufen<\/strong> f\u00fcr eine Bef\u00f6rderung in einer einzigen Tabelle je nach Woche in Frage kommen <strong>Bef\u00f6rderungsbedingungen<\/strong>.<\/p>\n<p>Als erste Aussage betrachten wir die folgenden Produkte auf Lager:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27587%27%20height%3D%270%27%20viewBox%3D%270%200%20587%200%27%3E%3Crect%20width%3D%27587%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-12-30-1024x326.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 587px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"587\" height=\"auto\" \/><div class=\"fusion-text fusion-text-9\"><p>Gem\u00e4\u00df den Bef\u00f6rderungsbedingungen in Woche 1:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-2 fusion-checklist-default type-icons paddingList dark-text\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>10% Rabatt auf Obst-, Gem\u00fcse- und Kr\u00e4uterregale<\/em><\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-10\"><p>das dataset_1 ist:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27598%27%20height%3D%270%27%20viewBox%3D%270%200%20598%200%27%3E%3Crect%20width%3D%27598%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-15-18-1024x304.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 598px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"598\" height=\"auto\" \/><div class=\"fusion-text fusion-text-11\"><p>In Woche 2 \u00e4ndern sich die Bef\u00f6rderungsbedingungen zu:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-3 fusion-checklist-default type-icons paddingList dark-text\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>10% Rabatt auf Obst- und Eiscreme-Regale<\/em><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>20% Rabatt auf Gem\u00fcse Regal<\/em><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p><em>kein Rabatt im Kr\u00e4uterregal<\/em><\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-12\"><p>Die Pipeline-Evolution data erzeugt das dataset_2:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27598%27%20height%3D%270%27%20viewBox%3D%270%200%20598%200%27%3E%3Crect%20width%3D%27598%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-19-04-1024x302.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 598px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"598\" height=\"auto\" \/><div class=\"fusion-text fusion-text-13\"><p>Aber... sind wir wirklich von den Ergebnissen \u00fcberzeugt? Typische Fragen sind:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-4 fusion-checklist-default type-icons\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Gibt es fehlende oder zus\u00e4tzliche Zeilen?<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Sind die implementierten Entwicklungen auf den betroffenen Feldern korrekt?<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Gibt es eine Regression auf nicht betroffenen Feldern?<\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-14\"><p><em>Hinweis: Auch wenn es aufgrund eines trivialen Beispiels mit einem kleinen dataset einfach erscheint, die obigen Fragen intuitiv zu beantworten, haben wir es in realen Anwendungsf\u00e4llen in der Regel mit einer Vielzahl von Zeilen und Spalten aus komplexen Abfragen zu tun (Transformationen, Joins, Aggregate, Windows-Funktionen, ...), die dieser Abfrage ihre volle Bedeutung verleihen<\/em><\/p>\n<\/div><div class=\"fusion-text fusion-text-15\"><p>Lassen Sie uns versuchen, die 3 Fragen zu beantworten, indem wir die Ergebnisse der Vergleichsabfrage interpretieren:<\/p>\n<\/div><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27619%27%20height%3D%270%27%20viewBox%3D%270%200%20619%200%27%3E%3Crect%20width%3D%27619%27%20height%3D%270%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2022\/01\/How-to-quickly-compare-two-datasets-using-a-generic-powerful-SQL-query-by-Ke\u0301vin-LEFEVRE-Artefact-Engineering-and-Data-Science-Jan-2022-Medium-18-01-2022-10-24-02-1024x336.png\" alt=\"Image\" class=\"lazyload artefact-elegant-image align-left hover-enable\" style=\"width: 619px; border-radius: 59% 41% 41% 59% \/ 29% 48% 52% 71%; overflow: hidden;\" width=\"619\" height=\"auto\" \/><div class=\"fusion-text fusion-text-16\"><p>Beobachtungen:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-5 fusion-checklist-default type-icons\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Keine Reihe mit Produkten aus dem Obstregal: <strong>gut<\/strong> als die \u201c<em>10% Rabatt auf Fr\u00fcchte<\/em>\u201d ist von Woche 1 zu Woche 2 unver\u00e4ndert geblieben<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Die Produkt_ID #50 (Kr\u00e4uterregal) fehlt in dataset_2 (nur in dataset_1): <strong>gut<\/strong> als die \u201c<em>Rabatt auf Kr\u00e4uter<\/em>\u201d wurde in Woche 2 gestoppt<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Die Produkt_ID #60 (Eiscreme-Regal) fehlt in dataset_1 (nur in dataset_2): <strong>gut<\/strong> als die \u201c<em>Rabatt auf Speiseeis<\/em>\u201d wurde in Woche 2 eingef\u00fchrt<\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Die Zeilen zur Product_ID #30 und #40 (Gem\u00fcseregal) weichen im Feld reduced_price voneinander ab: <strong>gut<\/strong> als die \u201c<em>Rabatt auf Gem\u00fcse<\/em>\u201d stieg von 10% in Woche 1 auf 20% in Woche 2<\/p>\n<\/div><\/li><\/ul><div class=\"fusion-text fusion-text-17\"><p>Endlich sieht alles gut aus:<\/p>\n<\/div><ul style=\"--awb-line-height:27.2px;--awb-icon-width:27.2px;--awb-icon-height:27.2px;--awb-icon-margin:11.2px;--awb-content-margin:38.4px;\" class=\"fusion-checklist fusion-checklist-6 fusion-checklist-default type-icons\"><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Gibt es fehlende oder zus\u00e4tzliche Zeilen?<strong> Nein!<\/strong><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Sind die implementierten Entwicklungen auf den betroffenen Feldern korrekt? <strong>Ja!<\/strong><\/p>\n<\/div><\/li><li class=\"fusion-li-item\" style=\"\"><span class=\"icon-wrapper circle-no\"><i class=\"fusion-li-icon awb-icon-check\" aria-hidden=\"true\"><\/i><\/span><div class=\"fusion-li-item-content\">\n<p>Gibt es eine Regression auf nicht betroffenen Feldern? <strong>Nein!<\/strong><\/p>\n<\/div><\/li><\/ul><div class=\"fusion-title title fusion-title-5 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Tipps und Tricks<\/h2><\/div><div class=\"fusion-text fusion-text-18\"><p>Auch wenn Sie in CTE Ihre Ingestion-SQL-Abfrage vor\/nach der Evolution kopieren k\u00f6nnen, k\u00f6nnte es eine gute Idee sein, die <strong>speichern Sie die Ergebnisse in tempor\u00e4ren Tabellen<\/strong> um den Vergleich zu vereinfachen, die Leistung der Abfrage zu verbessern und von der Zwischenspeicherung zu profitieren, wenn Sie die Abfrage mehrmals ausf\u00fchren.<\/p>\n<p>In der Praxis k\u00f6nnen Divergenzen zwischen zwei data-Sets unsch\u00f6n sein. Dann, <strong>Ergebnisse sortieren<\/strong> nach Schl\u00fcssel (der Granularit\u00e4t der datasets) und Flagge kann bei der Interpretation und dem Vergleich gleichwertiger Zeilen aus beiden datasets enorm helfen.<\/p>\n<p>Um die Untersuchungen zu erleichtern und den Ursprung der L\u00fccke zu identifizieren, hat die <strong>Verd\u00e4chtig datasets<\/strong> <strong>gemeinsame Felder<\/strong> k\u00f6nnen aus dem Vergleich entfernt (d.h. in CTE kommentiert) werden: Wenn es kein Ergebnis gibt, bedeutet das, dass alle verglichenen Felder gleich sind. Sie k\u00f6nnen sich dann beim n\u00e4chsten Vergleich nur auf die verbleibenden verd\u00e4chtigen Felder konzentrieren und Schritt f\u00fcr Schritt vorgehen.<\/p>\n<\/div><div class=\"fusion-text fusion-text-19\"><p><em>Hinweis: Eine gleichwertige Analyse h\u00e4tte auch mit der Strategie LEFT JOIN durchgef\u00fchrt werden k\u00f6nnen, w\u00e4re aber viel schwieriger zu pflegen (Verwaltung von NULLs und Feldvergleichen) und weniger effizient gewesen, da Mengenoperatoren leistungsf\u00e4higer sind als Joins.<\/em><\/p>\n<\/div><div class=\"fusion-title title fusion-title-6 fusion-sep-none fusion-title-text fusion-title-size-two\" style=\"--awb-margin-bottom-small:8px;\"><h2 class=\"fusion-title-heading title-heading-left fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:50;line-height:1.2;\">Zusammenfassung<\/h2><\/div><div class=\"fusion-text fusion-text-20\"><p>Diese Abfragevorlage erleichtert es Entwicklern daher, \u00c4nderungen an komplexen data-Pipelines schnell zu validieren. Sie ist eine n\u00fctzliche Erg\u00e4nzung zu herk\u00f6mmlichen Unit-Tests und kann sogar allgemeiner betrachtet werden, wenn zwei data-Sets mit \u00e4hnlichen Strukturen verglichen werden. Und nicht zuletzt ist die Logik dieser Abfrage sogar leicht auswendig zu lernen!<\/p>\n<\/div><div class=\"fusion-text fusion-text-21\"><p>Danke f\u00fcrs Lesen, ich hoffe, es war klar und ich w\u00fcrde mich \u00fcber Ihr Feedback freuen :)<\/p>\n<\/div><\/div><\/div><\/div><\/article><div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-5 fusion-flex-container nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-margin-top:40px;--awb-margin-bottom:40px;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-center fusion-flex-justify-content-center fusion-flex-content-wrap\" style=\"max-width:calc( 1440px + 20px );margin-left: calc(-20px \/ 2 );margin-right: calc(-20px \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-4 fusion_builder_column_1_1 1_1 fusion-flex-column fusion-flex-align-self-center\" style=\"--awb-padding-top:40px;--awb-padding-right:40px;--awb-padding-bottom:40px;--awb-padding-left:40px;--awb-overflow:hidden;--awb-bg-position:left center;--awb-bg-size:cover;--awb-border-color:rgba(10,17,40,0.1);--awb-border-style:solid;--awb-border-radius:4px 4px 4px 4px;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:10px;--awb-margin-bottom-large:0px;--awb-spacing-left-large:10px;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:10px;--awb-spacing-left-medium:10px;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:10px;--awb-spacing-left-small:10px;\"><div class=\"fusion-column-wrapper lazyload fusion-column-has-shadow fusion-flex-justify-content-center fusion-content-layout-column fusion-column-has-bg-image\" data-bg-url=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/03\/background.jpg\" data-bg=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/03\/background.jpg\"><div class=\"fusion-image-element\" style=\"text-align:center;--awb-margin-right:20px;--awb-margin-left:20px;--awb-max-width:150px;--awb-caption-title-font-family:var(--h2_typography-font-family);--awb-caption-title-font-weight:var(--h2_typography-font-weight);--awb-caption-title-font-style:var(--h2_typography-font-style);--awb-caption-title-size:var(--h2_typography-font-size);--awb-caption-title-transform:var(--h2_typography-text-transform);--awb-caption-title-line-height:var(--h2_typography-line-height);--awb-caption-title-letter-spacing:var(--h2_typography-letter-spacing);\"><span class=\"fusion-imageframe imageframe-none imageframe-2 hover-type-none\"><img decoding=\"async\" width=\"72\" height=\"41\" title=\"mittel\" src=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%2772%27%20height%3D%2741%27%20viewBox%3D%270%200%2072%2041%27%3E%3Crect%20width%3D%2772%27%20height%3D%2741%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-orig-src=\"https:\/\/www.artefact.com\/\/wp-content\/uploads\/2021\/03\/medium.png\" alt class=\"lazyload img-responsive wp-image-60927\"\/><\/span><\/div><div class=\"fusion-title title fusion-title-7 fusion-sep-none fusion-title-center fusion-title-text fusion-title-size-three\" style=\"--awb-margin-top:20px;--awb-margin-bottom:0px;--awb-margin-bottom-small:8px;\"><h3 class=\"fusion-title-heading title-heading-center fusion-responsive-typography-calculated\" style=\"margin:0;--fontSize:20;line-height:1.2;\">Mittel Blog von Artefact.<\/h3><\/div><div class=\"fusion-text fusion-text-22\" style=\"--awb-content-alignment:center;\"><p>Dieser Artikel wurde urspr\u00fcnglich ver\u00f6ffentlicht auf <strong>Medium.com<\/strong>.<br \/>\nFolgen Sie uns auf unserem Medium Blog !<\/p>\n<\/div><div style=\"text-align:center;\"><a class=\"fusion-button button-flat button-medium button-default fusion-button-default button-1 fusion-button-default-span fusion-button-default-type\" target=\"_blank\" rel=\"noopener noreferrer\" href=\"https:\/\/medium.com\/artefact-engineering-and-data-science\/how-to-quickly-compare-two-datasets-using-a-generic-powerful-sql-query-7efe752332b7\"><span class=\"fusion-button-text awb-button__text awb-button__text--default\">Unseren Artikel lesen<\/span><\/a><\/div><\/div><\/div><\/div><\/div><\/p>","protected":false},"excerpt":{"rendered":"<p>Eine Schritt-f\u00fcr-Schritt-Anleitung, die den Vergleich von datasets \u00fcber eine gebrauchsfertige Structured Query Language-Vorlage erleichtert<\/p>","protected":false},"featured_media":68686,"parent":0,"template":"","meta":{"_acf_changed":false,"ep_exclude_from_search":false},"blog-category":[21939],"blog-language":[2991],"class_list":["post-65787","blog","type-blog","status-publish","has-post-thumbnail","hentry","blog-category-medium","blog-language-en"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/blog\/65787","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/types\/blog"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/media\/68686"}],"wp:attachment":[{"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/media?parent=65787"}],"wp:term":[{"taxonomy":"blog-category","embeddable":true,"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/blog-category?post=65787"},{"taxonomy":"blog-language","embeddable":true,"href":"https:\/\/www.artefact.com\/de\/wp-json\/wp\/v2\/blog-language?post=65787"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}