Published October 29, 2021 | Version v1
Publication Open

Examining data visualization pitfalls in scientific publications

  • 1. Thai Nguyen University
  • 2. Texas Tech University
  • 3. Meharry Medical College

Description

Abstract: Data visualization blends art and science to convey stories from data via graphical representations. Considering different problems, applications, requirements, and design goals, it is challenging to combine these two components at their full force. While the art component involves creating visually appealing and easily interpreted graphics for users, the science component requires accurate representations of a large amount of input data. Without the science component, visualization cannot serve its role of creating correct representations of the actual data, which leads to incorrect perception, interpretation, and decisions. It might be even worse if incorrect visual representations were intentionally produced to deceive the viewers. To address common pitfalls in graphical representations, this paper focuses on identifying and understanding the root causes of misinformation in graphical representations. We reviewed the misleading data visualization examples in the scientific publications collected from indexing databases and then projected them onto the fundamental units of visual communication such as color, shape, size, and spatial orientation. Moreover, a text mining technique was applied to extract practical insights from common visualization pitfalls. Cochran's Q test and McNemar's test were conducted to examine whether the proportions of common errors differ among color, shape, size, and spatial orientation. The findings showed that the pie chart is the most misused graphical representation, and size is the most critical issue. It was also observed that there were statistically significant differences in the proportion of errors among color, shape, size, and spatial orientation.
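The two tests named in the abstract can be illustrated with a minimal sketch. It assumes each reviewed figure is coded with one 0/1 flag per visual channel (1 = the figure has an error in that channel). The data below are invented for illustration and are not the paper's results; the names `FIGURES`, `cochran_q`, and `mcnemar_exact` are hypothetical, and the exact binomial form of McNemar's test is one common variant that may differ from the one the authors used.

```python
from math import comb

# Hypothetical coding sheet (NOT the paper's data): one row per misleading
# figure, one 0/1 flag per channel in the order listed in CHANNELS.
CHANNELS = ["color", "shape", "size", "orientation"]
FIGURES = [
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
]


def cochran_q(rows):
    """Return (Q, degrees of freedom) for a matrix of 0/1 outcomes.

    Q = (k - 1) * (k * sum(C_j^2) - N^2) / (k * N - sum(R_i^2)),
    where C_j are column totals, R_i are row totals, N is the grand total,
    and k is the number of columns. Under the null hypothesis Q is
    approximately chi-square distributed with k - 1 degrees of freedom.
    """
    k = len(rows[0])
    col_totals = [sum(row[j] for row in rows) for j in range(k)]
    row_totals = [sum(row) for row in rows]
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined: every row is all 0s or all 1s")
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    return numerator / denominator, k - 1


def mcnemar_exact(rows, i, j):
    """Two-sided exact McNemar p-value comparing columns i and j.

    Only discordant rows matter: b counts rows flagged in i but not j,
    c counts rows flagged in j but not i. Under the null hypothesis,
    b follows Binomial(b + c, 0.5).
    """
    b = sum(1 for row in rows if row[i] == 1 and row[j] == 0)
    c = sum(1 for row in rows if row[i] == 0 and row[j] == 1)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, x) for x in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    q, df = cochran_q(FIGURES)
    print(f"Cochran's Q = {q:.3f} with {df} degrees of freedom")
    size, color = CHANNELS.index("size"), CHANNELS.index("color")
    print(f"McNemar (size vs color) p = {mcnemar_exact(FIGURES, size, color):.4f}")
```

On this toy data, Q is 11.0 with 3 degrees of freedom, and the size-versus-color comparison gives an exact p-value of 0.0625. In this kind of analysis Cochran's Q serves as the omnibus test across all four channels, and McNemar's test is then applied to pairs of channels, usually with a multiple-comparison correction.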

Files

s42492-021-00092-y.pdf

Files (4.0 MB)

md5:851ef522c40a1ed20e8fa1bb63483891
4.0 MB

Additional details

Identifiers

Other
https://openalex.org/W3211294635
DOI
10.1186/s42492-021-00092-y

GreSIS Basics Section

Is Global South Knowledge
Yes
Country
Vietnam
