Overview

Total artworks
Works by male artists
Works by female artists
Gender Data Complete

Artists

Uneven representation – who gets collected?

Each bubble represents an artist. The horizontal position shows when the artist was born, the vertical position shows how many works by that artist are in the museum collection, and the bubble size reinforces the number of artworks. This reveals historical patterns in which artists, eras, and genders the museum has prioritized.

Top 10 most collected artists

These rankings show which individual artists have the greatest representation in the collection, separated by gender. Comparing the #1 male artist to the #1 female artist reveals the scale difference in how deeply the museum has collected from leading artists of each gender.

Where are the artists from?

This map shows the geographic origins of artists in the SMK collection. Bubble size represents the number of unique artists from each country, and color indicates the dominant gender representation. Hover over bubbles to see detailed breakdowns.

Top nationalities by gender

This chart shows the top 10 nationalities in the collection, comparing male and female artist representation. Bars extend left for male artists and right for female artists, making gender disparities immediately visible. The length of each bar represents the number of unique artists from that nationality and gender.

When were the artists born?

This histogram shows the distribution of unique artists by birth year (grouped by decade), comparing male and female artists. This reveals which generations of artists are most represented in the collection and how gender representation varies across different birth cohorts.
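
The decade grouping behind this histogram can be sketched as follows. This is a minimal illustration, not the application's actual code: the `artists` array shape and the `birthYear` and `gender` field names are assumptions made for the example.

```javascript
// Floors a year to the start of its decade, e.g. 1887 -> 1880.
function decadeOf(year) {
  return Math.floor(year / 10) * 10;
}

// Counts unique artists per decade and gender.
// Assumed input shape: [{ birthYear: number, gender: "Male" | "Female" | "Unknown" }]
function countByDecade(artists) {
  const counts = new Map();
  for (const { birthYear, gender } of artists) {
    if (!Number.isFinite(birthYear)) continue; // skip artists without a birth year
    const decade = decadeOf(birthYear);
    const bucket = counts.get(decade) ?? { Male: 0, Female: 0, Unknown: 0 };
    bucket[gender] = (bucket[gender] ?? 0) + 1;
    counts.set(decade, bucket);
  }
  return counts; // Map of decade start year -> counts per gender
}
```

Artists without a birth year are skipped rather than assigned to a default decade, so missing metadata does not create an artificial bar.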

When were the artworks created?

This histogram shows when the artworks in the collection were actually created (grouped by decade), comparing works by male and female artists. This reveals which artistic periods and movements are most represented in the collection and how gender representation varies across different eras of art history.

Acquisition

Gender distribution by year

Acquisition trend in the last 50 years

This chart tracks the percentage of female artists among annual acquisitions from 1975 to 2025. The horizontal line shows the overall collection average for comparison, revealing whether recent acquisitions are improving gender balance.
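
The methodology notes state that this trend is visualized with linear regression. A minimal sketch of an ordinary least-squares fit is shown here; the `points` shape (`x` as the acquisition year, `y` as the female percentage for that year) is an assumption for illustration, and the application may compute its trend line differently.

```javascript
// Ordinary least-squares fit of y = slope * x + intercept.
// Assumed input shape: [{ x: year, y: percentage }]
function linearRegression(points) {
  const n = points.length;
  if (n < 2) throw new Error("need at least two points");
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (const { x, y } of points) {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) ** 2;
  }
  if (sxx === 0) throw new Error("x values must not all be equal");
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}
```

A positive slope would indicate that the share of female artists among acquisitions rises over the period; the slope is in percentage points per year.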

Acquisition lag: historical vs contemporary collecting

How long after creation are works acquired? This analysis examines the time gap between when a work was produced and when it entered the collection. A shorter lag suggests contemporary collecting (acquiring works from living or recently deceased artists), while longer lags indicate historical collecting (old masters). This reveals whether female artists are more likely to be collected during their lifetimes or posthumously rediscovered.

Years between production and acquisition

This chart shows the percentage of works in each lag category by gender. It reveals whether works by one gender are more likely to be acquired contemporaneously or after centuries.
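
The lag calculation can be sketched as below. The bin edges and labels are purely illustrative, since the chart's actual lag categories are not stated in the text, and the function names are hypothetical.

```javascript
// Years between production and acquisition. Returns null when either year is
// missing or when acquisition precedes production (likely a data error).
function acquisitionLag(productionYear, acquisitionYear) {
  if (!Number.isFinite(productionYear) || !Number.isFinite(acquisitionYear)) return null;
  const lag = acquisitionYear - productionYear;
  return lag >= 0 ? lag : null;
}

// Illustrative bins only (an assumption, not the chart's real categories).
const LAG_BINS = [
  { label: "0-10 years", max: 10 },
  { label: "11-50 years", max: 50 },
  { label: "51-100 years", max: 100 },
  { label: "Over 100 years", max: Infinity },
];

// Maps a lag in years to the first bin whose upper bound contains it.
function lagCategory(lag) {
  if (lag === null) return "Unknown";
  return LAG_BINS.find((bin) => lag <= bin.max).label;
}
```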

Departments by gender

This Sankey diagram shows how artworks by different artist genders are distributed across museum departments. The flows connect artist genders (left) to the top 15 museum departments (right), with the width of each flow proportional to the number of artworks.

Artworks

Object types

Gender distribution within each object type (%)

Techniques

Gender distribution within each technique (%)

Materials

Gender distribution within each material (%)

Subjects

Who depicts whom? Creator vs depicted gender

This chart explores the relationship between artist gender and the gender of people depicted in portraits and figural works. Do male artists predominantly depict men or women? What about female artists?

Depicted locations

This analysis explores the geographic locations depicted in artworks, revealing patterns in artistic mobility and subject matter. The map shows suggested places of depiction or reference from the collection, allowing us to examine whether male and female artists focused on different geographic regions: for instance, whether female artists tended to paint local Danish scenes while male artists depicted more international locations.

Distance from home

This chart shows how far from Copenhagen (the museum's location) the depicted locations are, grouped by distance ranges. It reveals whether one gender's works tend to depict more local versus international scenes.
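
According to the methodology notes, distances are computed with the Haversine formula from Copenhagen and summarized with medians. A minimal sketch of both steps follows; the Copenhagen coordinates are approximate values assumed for the example, and the function names are hypothetical.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius

// Approximate coordinates of Copenhagen, assumed here as the reference point.
const COPENHAGEN = { lat: 55.6761, lon: 12.5683 };

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance in kilometres between two { lat, lon } points.
function haversineKm(a, b) {
  const dLat = toRadians(b.lat - a.lat);
  const dLon = toRadians(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * Math.sin(dLon / 2) ** 2;
  // Clamp to 1 to guard against floating-point overshoot before asin.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(Math.min(1, h)));
}

// Median of a numeric array; returns null for an empty array.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

The median is less sensitive than the mean to a few very distant locations: for distances of 10, 20, and 3000 km, the median stays at 20 km while the mean is 1010 km.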

Characteristics

Color palette comparison

These treemaps show the actual colors used in artworks by male and female artists. Each rectangle represents a specific color from the SMK palette, with size indicating how frequently that color appears. Hover over colors to see their hex codes and usage statistics. The top 100 most common colors are shown for each gender.

Color distribution over time

These charts show how the distribution of 12 chromatic color families changed over time in artworks by male and female artists. The color families follow the traditional color wheel: 3 primary colors (red, yellow, blue), 3 secondary colors (orange, green, purple), and 6 tertiary colors (red-orange, yellow-orange, yellow-green, blue-green, blue-violet, red-violet). Each bar represents a decade, with color segments showing the percentage of artworks featuring each color family. Achromatic colors (black, white, gray) are excluded to highlight chromatic trends.

Male artists

Female artists

Average size of paintings

Do works by male artists tend to be physically larger? Physical size has historically been a proxy for perceived importance in art collections. Large-scale works often receive more prominent display locations and command greater attention. This analysis examines paintings in the collection to test whether there are systematic differences in dimensions by creator gender.

Average Dimensions (cm)

Average Area (cm²)

Size distribution (area in cm²)

This chart shows the percentage of paintings in each size category by gender. It reveals whether one gender's works are concentrated in smaller or larger size ranges.

Visibility

Exhibitions

These charts show exhibition patterns across the entire collection. The left chart shows the average number of times each artwork has been exhibited, while the right chart shows what percentage of works have been exhibited at least once.

Average exhibitions per artwork

Works exhibited at least once

Currently on display

This chart shows what percentage of artworks are currently on display versus not on display for each gender. Each bar represents 100% of the works for that gender, making it easy to compare display rates.

Digitization progress

This chart shows what percentage of artworks by each gender have been photographed or digitized. The presence of a digital image often indicates that an artwork has been prioritized for documentation and online visibility, which can be a marker of institutional attention and perceived importance.

About

Project Overview

SMK Data Visualized is an interactive data exploration tool that analyzes gender representation in the collection of Denmark's National Gallery of Art (Statens Museum for Kunst). Using the museum's public API, this application processes nearly 200,000 artworks to reveal patterns in institutional collecting practices, artist demographics, and curatorial decisions across time.

The project presents 18 distinct visualization sections examining artist demographics, temporal acquisition patterns, geographic distribution, physical characteristics, color analysis, subject matter, and institutional visibility. Through statistical analysis and interactive charts, it makes complex museum collection data accessible and transparent, encouraging informed dialogue about representation in cultural institutions.

About Statens Museum for Kunst (SMK)

Statens Museum for Kunst (SMK) is Denmark's national gallery, located in Copenhagen. The museum houses the Danish national collection of art, spanning 700 years from the Middle Ages to contemporary works. The collection includes paintings, sculptures, prints, drawings, and decorative arts from Danish and international artists.

Creator

Created by Victor Nordquist. This is an independent project, not officially affiliated with Statens Museum for Kunst.

Technology & Open Source

This is a client-side web application built with modern JavaScript (ES6 modules), Chart.js, and D3.js. All data processing happens in your browser. The application uses IndexedDB for optional local caching (with your consent) to improve performance on repeat visits.

The complete source code is available on GitHub under an open-source license. Contributions, suggestions, and feedback are welcome.

Gender Classification

How gender is determined: Gender is based on the creator_gender field in the SMK API, which reflects the museum's cataloguing. Values are normalized to "Male", "Female", or "Unknown". For depicted persons in artworks (portraits, etc.), gender is extracted from the content_person_full field when available.
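
The normalization step can be sketched as follows. The raw spellings handled here and the `production` array holding the creators are assumptions for illustration; only the `creator_gender` field name comes from the description above.

```javascript
// Hypothetical helper: maps a raw creator_gender value to one of three labels.
// The accepted raw spellings ("male", "m", "female", "f") are assumptions.
function normalizeGender(raw) {
  if (typeof raw !== "string") return "Unknown";
  const value = raw.trim().toLowerCase();
  if (value === "male" || value === "m") return "Male";
  if (value === "female" || value === "f") return "Female";
  return "Unknown";
}

// Uses only the first listed creator of a work.
// The `production` array and its shape are assumed for this sketch.
function primaryCreatorGender(artwork) {
  const creators = Array.isArray(artwork.production) ? artwork.production : [];
  return creators.length > 0 ? normalizeGender(creators[0].creator_gender) : "Unknown";
}
```

Anything that is missing or unrecognized falls through to "Unknown", so incomplete records are counted separately instead of being guessed.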

Key Assumptions & Scope

  • One creator per work: Only the primary creator's gender is analyzed. Collaborative works are attributed to the first listed creator.
  • Acquisition date as proxy: Acquisition dates are used as a proxy for institutional collecting priorities, though donations and bequests may not reflect active curatorial choices.
  • Current API snapshot: Display status, exhibition counts, and digitization status reflect the current state of the database, not historical patterns.
  • Geographical analysis: Geographic data is based on artist birth countries and depicted locations from the API. Not all artworks have complete geographic metadata.

Visualization & Analysis Methods

  • Sample sizes shown in tooltips: All percentage-based charts include actual counts in hover tooltips (e.g., "Female: 15.2% (23 of 151 works)") to help assess the reliability of percentages.
  • Percentages on small samples: Interpret with caution. Charts may show high percentages based on very few artworks.
  • Median vs. Average: For geographic distance analysis ("Distance from home"), median distances are reported instead of averages so that comparisons are not skewed by outliers (e.g., artworks depicting Greenland). When the average differs from the median by more than 200 km, both measures are shown with explanatory notes about outliers.
  • Trend analysis (linear regression): The 50-year female acquisition trend (1975-2025) uses linear regression to visualize long-term patterns. Insights compare the first 25 years (1975-1999) against the later period (2000-2025, 26 years) to assess progress.
  • Color categorization (HSL-based): Artwork colors are classified into 13 families based on HSL (Hue, Saturation, Lightness) color space analysis:
    • Chromatic colors (9): Red, Orange, Yellow, Yellow-Green, Green, Cyan, Blue, Purple, Magenta
    • Achromatic and near-neutral colors (4): Brown, Black, Gray, White
    Together these families cover the full hue range plus neutrals. Color data is extracted from the colors field in the API, which contains hex color codes representing dominant colors in each artwork's image.
  • Diverging bar charts: Used for nationality comparisons where bars extend left (male) and right (female) from a central axis, making gender differences visually intuitive.
  • Sankey diagrams: Used to visualize flow relationships (e.g., artist genders to departments) with thickness representing volume of artworks.
  • World maps & geographic visualization: Artist birth countries and depicted locations are mapped using D3.js with TopoJSON data. Distance calculations use the Haversine formula with Copenhagen (SMK location) as reference point.
  • Treemaps for color palette: Show top 100 colors as rectangles sized by frequency, displaying actual hex color codes for larger cells. Hover tooltips show color, count, and percentage.
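
The HSL-based color categorization listed above can be sketched roughly as follows. Every threshold and hue boundary in this example is an illustrative assumption; the application's real cut-offs are not given in the text. The sketch expects six-digit hex codes such as "#8b4513".

```javascript
// Converts "#rrggbb" to { h: 0-360, s: 0-1, l: 0-1 }.
function hexToHsl(hex) {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = ((n >> 16) & 255) / 255;
  const g = ((n >> 8) & 255) / 255;
  const b = (n & 255) / 255;
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const l = (max + min) / 2;
  const d = max - min;
  if (d === 0) return { h: 0, s: 0, l }; // pure gray: hue is undefined, use 0
  const s = d / (1 - Math.abs(2 * l - 1));
  let h;
  if (max === r) h = ((g - b) / d) % 6;
  else if (max === g) h = (b - r) / d + 2;
  else h = (r - g) / d + 4;
  h *= 60;
  if (h < 0) h += 360;
  return { h, s, l };
}

// Upper hue bounds (degrees) for the nine chromatic families. Assumed values.
const HUE_FAMILIES = [
  { max: 15, name: "Red" },
  { max: 45, name: "Orange" },
  { max: 70, name: "Yellow" },
  { max: 100, name: "Yellow-Green" },
  { max: 160, name: "Green" },
  { max: 200, name: "Cyan" },
  { max: 260, name: "Blue" },
  { max: 300, name: "Purple" },
  { max: 345, name: "Magenta" },
  { max: 360, name: "Red" }, // hue wraps around back to red
];

// Assigns a hex color to one of the 13 families named in the list above.
function classifyColor(hex) {
  const { h, s, l } = hexToHsl(hex);
  if (l < 0.1) return "Black";
  if (l > 0.9) return "White";
  if (s < 0.1) return "Gray";
  // Assumption: brown is treated as a dark color in the orange hue range.
  if (h >= 15 && h < 45 && l < 0.4) return "Brown";
  return HUE_FAMILIES.find((family) => h < family.max).name;
}
```

Lightness and saturation are checked before hue because very dark, very light, or desaturated colors have hues that are unstable or meaningless.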

Statistical Notes

  • No significance testing: This is exploratory data analysis. Observed differences are not tested for statistical significance. Patterns should be interpreted as descriptive, not inferential.
  • Temporal aggregation: Many charts aggregate by decade (e.g., birth years, acquisition dates) to reveal patterns while smoothing year-to-year noise.
  • Percentage calculations: Percentages are calculated within each category (e.g., "% of male artists' works" vs "% of female artists' works"), not as overall collection percentages, unless otherwise noted.
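
The within-category percentage and the tooltip format quoted earlier ("Female: 15.2% (23 of 151 works)") can be sketched as below. The function names are hypothetical and the rounding to one decimal place is inferred from that quoted example.

```javascript
// Share of `count` within its own category total, as a percentage.
// Returns 0 for an empty category to avoid dividing by zero.
function percentWithin(count, categoryTotal) {
  return categoryTotal > 0 ? (count / categoryTotal) * 100 : 0;
}

// Builds a tooltip label that shows the percentage together with raw counts.
function tooltipLabel(gender, count, categoryTotal) {
  const pct = percentWithin(count, categoryTotal).toFixed(1);
  return `${gender}: ${pct}% (${count} of ${categoryTotal} works)`;
}
```

Calling `tooltipLabel("Female", 23, 151)` reproduces the example string from the text, and keeping the counts next to the percentage makes small samples easy to spot.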

Data Quality & Limitations

  • Missing metadata: Not all artworks have complete information. Fields like creator_gender, production_year, dimensions, colors, and geographic data may be incomplete.
  • Cataloguing practices: Gender classification reflects museum cataloguing practices, which may have evolved over time and may not capture all nuances of identity.
  • Historical bias in data: The data reflects historical collecting practices and may perpetuate historical biases in representation.

Data Anomalies

  • Peaks in 1787 and 1887: The museum's database was established in 1887. Artworks with unknown acquisition dates were recorded as acquired "before 1887." A later database version interpreted this as a 100-year interval (1787-1887), and the data extraction process assigns each such work to one end of that interval (1787 or 1887). The resulting peaks are artificial: they represent artworks acquired before 1887 whose exact acquisition year is unknown.
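
One possible way to keep these placeholder years from distorting a time series is to separate them before charting. This is a sketch under assumptions, not the application's documented behavior: `acquisitionYear` is an assumed field name, and a work genuinely acquired in 1887 cannot be told apart from a placeholder by year alone.

```javascript
// Years the text identifies as stand-ins for "acquired before 1887".
const PLACEHOLDER_ACQUISITION_YEARS = new Set([1787, 1887]);

// Splits records into those with a usable acquisition year and those
// carrying one of the placeholder years.
function splitPlaceholderYears(records) {
  const dated = [];
  const placeholder = [];
  for (const record of records) {
    const target = PLACEHOLDER_ACQUISITION_YEARS.has(record.acquisitionYear) ? placeholder : dated;
    target.push(record);
  }
  return { dated, placeholder };
}
```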

Performance & Caching

  • GDPR-compliant storage: The application uses browser IndexedDB to cache API data for improved performance. Caching requires user consent, and the consent cookie expires after 365 days.
  • Cache duration: Cached data expires after 30 days. Users can manually refresh via the "Refresh Data" button.
  • Lazy loading: Below-the-fold charts are loaded on-demand as users scroll to improve initial page load time.
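
The 30-day cache expiry described above can be sketched as a freshness check around the data loader. The entry shape and the `readCache`, `writeCache`, and `fetchAll` callbacks are hypothetical; in the browser the first two would wrap IndexedDB reads and writes.

```javascript
const CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days, per the text

// Assumed entry shape: { savedAt: epoch milliseconds, data: any }.
function isCacheFresh(entry, now = Date.now()) {
  return Boolean(entry) && Number.isFinite(entry.savedAt) && now - entry.savedAt < CACHE_TTL_MS;
}

// Returns cached data when it is still fresh, otherwise fetches a new copy
// and stores it. All three callbacks are hypothetical async functions.
async function loadData({ readCache, writeCache, fetchAll, now = Date.now() }) {
  const cached = await readCache();
  if (isCacheFresh(cached, now)) return cached.data;
  const data = await fetchAll();
  await writeCache({ savedAt: now, data });
  return data;
}
```

A manual "Refresh Data" action could bypass the freshness check and call the fetch path directly.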