Ionic Charge Balance - a functional approach

When working with water chemistry you are often presented with analytic results either send to you by a lab or probably even measured by yourself. From that analysis (usually from several) you are then supposed to draw insight about what is going on in your river/pond/aquifer. In order to be able to do that, it is mandatory that you can rely on your data. After all, any conclusion drawn from a dataset can never be better than the data itself.

In water chemistry, it is a good custom to check the plausibility of incoming analyses and one fundamental tool to do that is the determination of the ionic charge balance error (ICBE). In Germany, where I happen to live and work, the ICBE, by convention apparently stemming from “Deutscher Verband für Wasserwirtschaft und Kulturbau e.V. (DVWA) Heft 128/92” (though nobody seems to be able to dig up the original reference any more), is calculated using the following formula:

That formula comes with a convention on how to interpret the results:

  • if the sum of anions and cations both are <= 2 meq/l, an ICBE <= 5% is acceptable
  • if that sum is higher for either cations or anions, an ICBE <= 2% is acceptable

It is worth mentioning that Germany has since 1992 followed kind of their own path on this. The international standard formula comes without the factor “0.5” in the denominator. I don’t know whether the same limits on what an acceptable charge balance error is apply.

Anyway, the formula to calculate the ICBE looks straightforward enough, so it should be no problem to apply it to your water data, right? Alas, there are again a couple of catches. I already mentioned, that one might have to modify the formula, depending on whether you work in a German or an international setting. Also, sometimes you might need to know the exact ICBE, while on other occasions it would be more handy to just get a quick feedback on whether your analysis passes the test or not. Luckily, the R functional approach to programming allows us to tackle all this disambiguity with reusable and flexible code snippets.

Lastly, for people new to water chemistry (and we all have at some point been there), getting your analysis into that formula can be challenging itself. What is an anion, what is a cation? And which anions and cations do I need to have measured to make a meaningful ICBE?. What does that strange unit meq/l refer to? I would like to discuss these issues at length here, too, as many explanations on the topic I found online again seemed to cover only parts of the whole story.

Let’s start by getting us some water analyses!

library(tidyverse)     # Data wrangling and plotting
library(PeriodicTable) # Look up element properties
library(kableExtra)    # Table output formatting

# You can download the dataset for this tutorial from Github.
# It is a random sample of 100 Groundwater measurements
# made in the German state of North Rhine-Westphalia in 2018.

# The original dataset is published under the following licence:
# Datenlizenz Deutschland 2.0
# https://www.govdata.de/dl-de/by-2-0
# OpenHygrisC_gw-messstellen-messwerte_EPSG25832_CSV.zip
# https://www.opengeodata.nrw.de/produkte/umwelt_klima/wasser/hygrisc/

raw <- read_csv(
  "https://raw.githubusercontent.com/Ignimbrit/exchange/master/data/2020/HygrisC_GWsample.csv"
  ) %>% 
  select(-Lizenz)

# All Concentrations in `raw` are given in mg/l
kable(raw) %>% 
  kable_styling() %>% 
  scroll_box(width = "700px", height = "400px")
well_no date Na K Mg Ca Mn Fe HCO3 NO3 NH4 PO4 SO4 Cl
73564217 2018-08-15 42.0 8.30 20.0 160.0 0.0020 0.020 406.0 27.00470 0.06440 0.03077 110.0 81.0
100150287 2018-09-24 26.0 7.40 13.0 120.0 1.3000 16.000 272.0 1.32810 3.86400 0.03077 82.0 52.0
80302555 2018-09-03 14.0 22.00 14.0 72.0 0.3100 0.035 16.0 146.09100 0.06440 0.04000 110.0 32.0
80302180 2018-06-22 17.0 8.40 5.7 29.0 1.2000 0.180 1.0 42.49920 0.06440 0.03077 60.0 32.0
110250035 2018-08-08 54.0 9.50 9.4 200.0 0.0250 0.024 460.0 1.32810 0.06440 0.06462 120.0 91.0
99580020 2018-04-25 4.5 1.10 4.9 91.0 0.0066 0.025 221.0 8.85400 0.06440 0.04616 49.0 9.3
10407194 2018-07-10 4.5 0.77 2.2 5.4 0.0068 0.020 12.0 9.29670 0.06440 0.03077 5.8 7.9
129660255 2018-06-11 2.5 0.66 5.0 6.1 0.0020 0.020 30.0 6.64050 0.06440 0.03692 8.1 5.0
110040260 2018-05-29 8.3 3.90 14.0 130.0 0.0500 0.300 383.0 1.32810 0.15456 0.03077 20.0 21.0
59540515 2018-07-04 31.0 4.40 13.0 53.0 0.0022 0.020 128.0 2.12496 0.15456 0.08923 68.0 45.0
289194611 2018-03-14 55.0 7.10 18.2 85.9 0.0060 0.030 255.0 31.00002 0.09985 0.06154 97.0 63.0
70284118 2018-06-13 38.0 2.20 7.5 43.0 0.0020 0.098 112.0 13.28100 0.06440 0.11077 54.0 33.0
80200758 2018-09-13 15.0 2.90 11.0 31.0 0.0340 0.020 14.0 57.55100 0.30912 0.03692 57.0 25.0
70276110 2018-04-11 340.0 18.00 39.0 100.0 0.0420 0.020 744.0 16.82260 0.10304 0.33847 140.0 290.0
76866713 2018-09-20 48.0 4.10 19.0 190.0 0.0020 0.020 530.0 11.06750 0.06440 0.04000 70.0 97.0
110210086 2018-09-04 52.0 2.70 8.6 130.0 0.0940 0.360 412.0 1.32810 0.20608 0.03077 43.0 66.0
110070173 2018-04-11 14.0 2.70 5.6 33.0 1.3000 62.000 101.0 1.32810 0.19320 0.03077 86.0 26.0
59540837 2018-02-13 140.0 5.80 11.0 42.0 0.0020 0.020 306.0 16.82260 0.06440 0.03077 64.0 85.0
73727088 2018-02-23 4.4 1.00 7.3 9.7 0.0036 0.020 34.0 9.73940 0.06440 0.03077 22.0 6.8
10409087 2018-06-20 5.9 1.10 37.0 86.0 0.1200 0.020 334.0 23.02040 0.06440 0.03077 30.0 18.0
110250126 2018-08-09 24.0 2.70 20.0 180.0 0.0240 0.240 458.0 1.32810 0.18032 0.03077 92.0 91.0
110220894 2018-07-16 11.0 3.80 6.5 98.0 0.0390 0.040 184.0 1.59372 0.06440 0.03077 71.0 46.0
21000086 2018-05-15 30.0 1.50 2.7 95.0 0.3500 3.800 194.0 1.32810 0.64400 0.03077 76.0 50.0
100702119 2018-10-24 18.7 1.20 10.3 130.0 0.0200 0.040 272.1 28.00002 0.09985 0.04015 85.0 41.0
98120402 2018-09-06 13.0 2.40 27.0 140.0 0.1100 4.400 539.0 1.32810 6.18240 0.04616 7.9 19.0
219480618 2018-05-17 44.0 4.60 18.3 146.0 0.0050 0.010 162.0 65.00000 0.09985 0.06154 140.0 149.0
100706940 2018-04-12 26.0 1.00 18.8 216.0 0.4800 8.410 456.9 1.99999 0.29954 0.79282 240.0 33.0
94120341 2018-04-26 11.0 2.10 5.7 160.0 0.0098 0.020 340.0 35.41600 0.09016 0.03077 57.0 32.0
94140273 2018-04-25 12.0 2.50 7.6 23.0 0.1200 0.028 59.0 1.37237 0.06440 0.03077 33.0 18.0
73561915 2018-07-31 28.0 2.70 20.0 150.0 0.0020 0.020 363.0 34.53060 0.06440 0.03077 100.0 53.0
100750059 2018-04-16 75.9 13.50 38.5 110.0 0.0100 0.010 264.7 18.00000 0.09985 0.20071 91.0 210.0
76811311 2018-06-25 21.0 3.60 5.4 23.0 0.0020 0.020 79.0 12.83830 0.06440 0.03077 21.0 26.0
110060106 2018-08-27 16.0 14.00 7.1 42.0 0.0320 0.020 39.0 66.40500 0.06440 0.06462 35.0 38.0
26531124 2018-05-24 12.0 29.00 6.3 120.0 0.0680 0.040 288.0 48.69700 0.06440 0.03077 52.0 20.0
110200214 2018-07-10 29.0 8.00 8.9 37.0 0.0940 0.310 29.0 31.87440 0.30912 0.03077 64.0 56.0
26540216 2018-05-15 82.0 9.20 2.1 120.0 0.1700 1.700 347.0 1.32810 0.56672 0.03385 8.5 120.0
129660188 2018-06-27 12.0 0.83 19.0 120.0 0.0020 0.036 345.0 30.10360 0.06440 0.03077 17.0 29.0
26504595 2018-10-23 6.4 1.40 26.2 103.0 0.0100 0.010 316.6 33.00001 0.09985 0.04015 58.0 20.0
10203199 2018-05-23 24.0 1.20 11.0 83.0 0.0020 0.020 203.0 44.27000 0.06440 0.08308 50.0 41.0
26500012 2018-01-25 5.2 1.20 7.0 130.0 0.0070 0.020 271.0 14.60910 0.06440 0.03077 10.0 7.1
80303328 2018-06-29 22.0 12.00 14.0 130.0 1.0000 0.041 356.0 5.75510 0.29624 0.03077 65.0 37.0
21150023 2018-05-24 11.0 0.87 3.1 100.0 0.3800 5.600 215.0 1.32810 0.27048 0.03077 44.0 33.0
59130441 2018-07-25 26.0 76.00 20.0 170.0 0.0020 0.020 513.0 41.17110 0.06440 0.28308 130.0 37.0
279481317 2018-08-27 51.0 1.40 25.2 191.0 0.0050 0.040 327.0 72.99999 0.09985 0.09231 127.0 161.0
10301574 2018-08-27 22.0 1.40 11.0 65.0 0.0021 0.039 100.0 29.21820 0.06440 0.03077 93.0 36.0
106530033 2018-05-28 16.0 2.60 11.0 150.0 0.0084 0.180 293.0 88.54000 0.06440 0.03077 68.0 28.0
100700536 2018-04-26 16.3 6.40 6.1 34.7 1.4400 33.100 65.3 4.99999 0.54915 1.91685 66.0 53.0
106530021 2018-05-28 29.0 7.70 9.1 150.0 0.0110 0.420 322.0 123.95600 0.06440 0.03077 46.0 36.0
70134212 2018-09-13 77.0 9.90 29.0 98.0 0.0020 0.091 325.0 27.44740 0.06440 0.06769 99.0 100.0
106502062 2018-04-18 8.7 1.10 7.2 36.3 0.0500 0.600 105.5 1.00002 0.09985 0.04015 20.0 13.0
91112000 2018-09-06 18.0 2.80 18.0 130.0 0.1600 1.300 367.0 1.32810 1.05616 0.03077 92.0 29.0
100761161 2018-03-22 13.2 1.60 14.0 128.0 0.0300 0.030 380.0 16.00002 0.09985 0.04015 59.0 18.0
10410053 2018-10-25 4.0 0.55 44.0 80.0 0.0020 0.020 380.0 28.33280 0.06440 0.03077 23.0 9.0
73311716 2018-07-26 23.0 6.30 21.0 130.0 0.0020 0.020 352.0 26.11930 0.06440 0.03077 100.0 46.0
73512618 2018-08-08 45.0 7.30 23.0 170.0 0.0020 0.035 383.0 53.12400 0.06440 0.03077 150.0 59.0
110200354 2018-07-02 10.0 3.00 8.3 130.0 1.4000 12.000 346.0 1.32810 0.83720 0.03077 52.0 20.0
60220855 2018-05-16 320.0 21.00 16.0 38.0 0.0034 0.260 677.0 1.32810 0.87584 0.03385 93.0 150.0
16003998 2018-01-30 8.0 1.10 1.8 180.0 0.0020 0.020 371.0 20.36420 0.06440 0.15077 59.0 21.0
110200445 2018-06-21 27.0 12.00 29.0 130.0 0.2900 0.037 294.0 1.32810 0.06440 0.03077 120.0 86.0
91140602 2018-07-16 15.0 2.00 22.0 190.0 0.2700 0.410 523.0 4.02857 0.06440 0.03077 40.0 56.0
26540514 2018-05-23 7.3 9.30 2.6 85.0 0.0150 0.020 128.0 88.54000 0.06440 0.03077 27.0 13.0
113109477 2018-10-16 16.0 13.00 7.5 130.0 0.0590 0.510 281.0 1.32810 0.06440 0.03077 110.0 31.0
10202201 2018-03-15 40.0 3.70 11.0 36.0 0.0160 0.330 77.0 21.24960 0.06440 0.03077 84.0 36.0
26541117 2018-05-17 15.0 10.00 2.7 75.0 0.1700 2.800 116.0 43.82730 0.06440 0.03077 62.0 27.0
70200518 2018-07-02 20.0 2.40 9.9 59.0 0.0020 0.020 170.0 6.64050 0.06440 0.03077 27.0 29.0
100706680 2018-04-12 61.5 7.80 16.3 162.0 0.1500 0.310 354.4 16.99999 0.09985 0.04015 135.0 119.0
10201117 2018-03-15 22.0 3.00 21.0 150.0 0.0020 0.020 265.0 61.97800 0.06440 0.04616 130.0 66.0
110151094 2018-09-26 39.0 1.10 15.0 130.0 0.1400 0.071 363.0 1.32810 0.69552 0.03077 29.0 76.0
110060179 2018-09-03 81.0 12.00 9.1 120.0 0.0027 0.020 287.0 41.17110 0.06440 0.03077 72.0 96.0
10410077 2018-07-03 220.0 2.90 22.0 65.0 1.5000 0.020 811.0 1.32810 0.23184 0.03077 11.0 12.0
110200251 2018-07-03 8.6 6.70 5.4 53.0 0.0210 0.036 26.0 61.97800 0.06440 0.03077 70.0 10.0
99780082 2018-08-01 18.0 2.10 9.3 59.0 0.0020 0.025 133.0 11.06750 0.06440 0.03077 54.0 30.0
26541154 2018-05-17 15.0 11.00 4.7 97.0 0.8200 0.066 225.0 23.02040 0.06440 0.03077 55.0 19.0
40304024 2018-06-20 16.0 3.20 19.0 70.0 0.0020 0.020 48.0 97.39400 0.06440 0.36924 99.0 33.0
100706617 2018-11-21 19.1 0.90 9.9 135.0 0.7400 2.800 319.6 1.00002 0.17971 0.22077 101.0 41.0
100140920 2018-06-26 18.0 2.00 10.0 220.0 0.2400 3.100 300.0 1.32810 0.20608 0.03077 110.0 35.0
70156116 2018-10-19 51.0 6.80 34.0 120.0 0.4500 7.400 269.0 1.81507 0.20608 0.03077 250.0 49.0
219971614 2018-08-07 21.0 0.83 23.5 173.0 0.0050 0.010 346.0 77.00001 0.09985 0.06154 111.0 85.0
106506043 2018-04-04 7.9 1.00 29.2 65.1 0.0100 0.010 244.0 10.99999 0.09985 0.04015 79.0 10.0
100140750 2018-10-18 100.0 3.20 9.7 160.0 1.1000 18.000 274.0 1.32810 0.29624 0.03077 88.0 290.0
70157110 2018-10-11 57.0 5.40 22.0 130.0 0.0020 0.020 361.0 23.46310 0.06440 0.03077 120.0 83.0
91153300 2018-07-31 13.0 14.00 4.6 190.0 0.0760 0.020 464.0 6.19780 0.06440 9.93871 40.0 26.0
26490997 2018-03-14 23.2 4.00 5.5 116.0 0.2670 1.630 231.0 27.60000 0.13979 0.03009 65.8 52.5
94190173 2018-05-17 13.0 2.40 16.0 160.0 0.0020 0.020 292.0 37.62950 0.06440 0.04616 180.0 26.0
59130726 2018-08-06 23.0 5.40 19.0 140.0 3.7000 1.300 422.0 1.32810 1.14632 0.03077 76.0 28.0
59620213 2018-06-04 5.3 0.95 8.6 61.0 0.0020 0.020 158.0 15.05180 0.06440 0.03077 40.0 6.6
80303237 2018-07-09 11.0 5.50 9.3 86.0 0.0044 0.040 183.0 53.12400 0.06440 0.08616 62.0 16.0
10201415 2018-05-07 21.0 1.20 17.5 129.0 0.0050 0.040 230.0 92.00001 0.09985 0.12308 97.0 50.0
59621280 2018-10-16 55.0 2.40 19.0 150.0 0.0200 0.020 242.0 15.05180 0.06440 0.03077 150.0 130.0
110240133 2018-05-30 8.6 1.50 1.6 68.0 0.0170 0.029 173.0 10.62480 0.27048 0.11385 17.0 7.5
26503153 2018-03-15 7.5 1.50 38.5 132.0 0.0100 0.010 368.4 22.00002 0.09985 0.04015 163.0 16.0
20103074 2018-12-17 36.0 97.00 11.0 83.0 0.6600 0.100 223.0 27.45000 0.50000 0.71000 130.0 66.0
73746710 2018-06-20 51.0 6.90 21.0 100.0 0.0020 0.020 316.0 23.02040 0.06440 0.03077 68.0 78.0
80303183 2018-11-06 21.0 3.40 12.0 90.0 0.0020 0.020 63.0 106.24800 0.06440 0.09846 110.0 48.0
110050010 2018-07-11 34.0 9.60 11.0 69.0 0.0240 3.400 64.0 1.32810 0.19320 0.04923 130.0 50.0
59130570 2018-07-23 100.0 3.80 28.0 120.0 0.0090 1.500 640.0 1.32810 3.22000 0.03077 47.0 44.0
106502463 2018-04-25 7.3 0.80 23.0 51.3 0.0100 0.010 171.4 22.00002 0.09985 0.04015 62.0 16.0
106506031 2018-10-18 11.5 0.90 28.7 63.5 0.0100 0.010 277.6 13.00002 0.09985 0.04015 52.0 24.0
110070057 2018-08-06 13.0 16.00 6.0 71.0 0.0021 0.051 90.0 119.52900 0.06440 1.38465 43.0 19.0
76809407 2018-02-26 21.0 1.10 10.0 47.0 0.0049 0.048 97.0 7.96860 0.06440 0.03077 8.0 82.0

In its current form, that table is not easy to work with in R. We need a simple way to apply transformations to all measurements at once. It is therefore better to organize the data in columns, so we can readily push them into tidyverse functions. Also we need to know which elements we actually deal with.

df_base <- raw %>% 
  pivot_longer(Na:Cl, names_to = "Param", values_to = "mg/l")

# Check which ions were analyzed in the dataset
df_base$Param %>% unique() %>% paste0(collapse = ", ") %>% cat()
## Na, K, Mg, Ca, Mn, Fe, HCO3, NO3, NH4, PO4, SO4, Cl

So we have actually quite a broad inventory of elements and molecules here. All of those can, and if they are available they should, go into your ICBE calculation. However, in many contexts it might already be possible to obtain good results when only Na, K, Mg, Ca, HCO3, SO4 & Cl are available, as they often constitute the bulk of dissolved ions in natural waters.

In order to derive gainful insight about our measurements of those parameters, though, we need to have additional knowledge about their chemical properties.

# The mass of an ion can be acquired from {PeriodicTable} function `mass`
# The charge, unfortunately, you *have to know*. Explaining this basic concept
# of aqueous chemistry is a bit beyond the scope of this blog post.
df_specs <- tribble(
  ~Param, ~charge, ~`mass [g/mole]`,
    "Na",      1L,   mass("Na"),
     "K",      1L,   mass("K"),
    "Mg",      2L,   mass("Mg"),
    "Ca",      2L,   mass("Ca"),
    "Mn",      2L,   mass("Mn"),
    "Fe",      2L,   mass("Fe"),
  "HCO3",     -1L,   mass("H") + 1*mass("C") + 3*mass("O"),
   "NO3",     -1L,   mass("N") + 3*mass("O"),
   "NH4",      1L,   mass("N") + 4*mass("H"),
   "PO4",     -3L,   mass("P") + 4*mass("O"),
   "SO4",     -2L,   mass("S") + 4*mass("O"),
    "Cl",     -1L,   mass("Cl")
)

Now that we do have our analyses and know the properties of the ions we are dealing with, we can transform the concentrations that we’ve been given in mg/l (milligram per liter) into meq/l (millimole equivalent per liter), the unit that goes into the ICBE formula. mg/l refers to the mass of a substance (e.g. chloride) solved in a liquid (water, in this case). That much is easy to understand. But what is mole and what does mole equivalent mean?.

While the unit mg (1/1000 g) refers to the mass of the substance, mole refers to its amount, the very number of atomic particles of a certain kind. As different kinds of particles, say Na and Cl, have different masses, weighing in the same amount of Na and Cl with a balance will provide you with a different number of particles. One mole is short hand for 6.02214076×10^23 particles, just like a dozen is short hand for 12.

To go from a mass in g to an amount of particles in mole you need to know the molar mass of an element or molecule. The molar mass for every element can be read from any decent periodic table or called in R directly with PeriodicTable::mass(). It comes in g/mole, so if you divide your substance mass in g by the element’s molar mass, the g will cancel out and you end up with your substance amount in mole.

Why is it important to know how many particles exactly you have? Because as we are dealing with ions here, every particle comes with a charge and we need to add up those numbers. And this is where the transition from “mole” to “mole equivalent” comes in. All ions are charged. But some ions are more charged than others. As we are trying to ultimately calculate a charge balance here, we will eventually have to multiply the amount of particles we have with the amount of charge that is assigned to the individual particles.

That was a lot of text that probably simultaneously bored the people with a chemical background and confused those without. Therefore now let’s jump back to the code and look at how to actually perform the described unit conversion.

df_mole <- df_base %>% 
  left_join( # First I'll attach the ion properties to my concentrations
    df_specs, by = "Param"
  ) %>% 
  mutate(    # Next let's add a column specifying whether charge is + or -
    iontype = case_when(
      charge > 0 ~ "Cation",
      charge < 0 ~ "Anion"
    )
  ) %>% 
  mutate(    # Here is the actual unit conversion, as explicit as possible
    `g/l`     = `mg/l`    / 1000,
    `mole/l`  = `g/l`     / `mass [g/mole]`,
    `mmole/l` = `mole/l`  * 1000,
    `meq/l`   = `mmole/l` * abs(charge) # number of particle times their charge
  )

# Now that we have the correct unit, all we need to do is sum up
# our anions and cations.
df_Ionsum <- df_mole %>% 
  group_by(well_no, date, iontype) %>% 
  summarise(ionsum = sum(`meq/l`)) %>% 
  ungroup() %>% 
  pivot_wider(names_from = iontype, values_from = ionsum)

With our analytic data in the right shape and unit, we can go ahead and write a function that takes the sum of anions and cations and calculates the ICBE.

ionic_charge_error <- function(
  sum_cations, sum_anions, 
  absolute = TRUE,         # differentiate between positive & negative error?
  mode = "GER",            # which calculation scheme to use?
  logical = FALSE,         # return only whether test was passed?
  discard_threshold = "default"  # How much error is too much?
){
  
  # Should the result be a positive number only (the default)?
  if(absolute == TRUE){
    numerator <- abs(sum_cations-sum_anions)
  } else {
    numerator <- sum_cations-sum_anions
  }
  
  # Use the deviant German calculation scheme?
  if(mode == "GER"){
    denominator <- (sum_cations+sum_anions)*0.5
  } else if(mode == "INT"){
    denominator <- (sum_cations+sum_anions)
  } else {
    stop("mode not supported")
  }
  
  # Calculate the actual charge balance
  ionic_charge_error <- numerator/denominator*100
  
  # Sometimes you do only care whether the charge error is acceptable or not
  if(logical == TRUE){
    
    if(discard_threshold == "default"){
      # This is another (probably German?) convention
      if(any(c(sum_anions, sum_cations) > 2)){limit <- 2L} else {limit <- 5L}
    } else {
      # Alternatively, You can just supply your own, personal threshold
      limit <- discard_threshold
    }
    
    if(ionic_charge_error > limit){
      FALSE
    } else {
      TRUE
    }
  } else {
    ionic_charge_error
  }
}

As you can see, an R function allows a lot of flexibility for advanced users while, by setting reasonable defaults, still remains easily applicable even for people not familiar with every detail of the underlying concept.

Finally, we can now have a look at the quality of those water analyses I dug up.

df_icbe <- df_Ionsum %>% 
  mutate( # I'll try both modes of the function: numeric and logical
    charge_error = map2_dbl(Cation, Anion, ionic_charge_error),
    keep = map2_lgl(Cation, Anion, ionic_charge_error, logical = TRUE)
  )

# The result is a large data.frame with lot's of values.  
# To get a better understanding of the results, a plot can help.

ggplot(
  data = df_icbe %>% 
    mutate(
      `acceptable error` = map2_chr(
        Anion, Cation, function(x, y){
          if(any(c(x, y) > 2)){"2%"} else {"5%"}
        }
      ) 
    ),
  mapping = aes(
    x = charge_error,
    fill = `acceptable error`
  )
) +
  geom_histogram(
    binwidth = 1, color = "black", boundary = 0
  ) +
  geom_vline(
    aes(
      xintercept = parse_number(`acceptable error`),
      color = `acceptable error`
      ),
    size = 2
    ) +
  scale_fill_manual(values = c("firebrick", "steelblue")) +
  scale_color_manual(values = c("red", "blue")) +
  scale_x_continuous(breaks = seq(0, 50, 2)) +
  scale_y_continuous(breaks = seq(0, 20, 2)) +
  labs(
    title = "Calculated Ionic Charge Balance Error",
    subtitle = "100 groundwater samples collected 2018 in North Rhine-Westphalia, Germany",
    x = "charge balance error in percent"
  ) +
  theme_bw() +
  theme(
    axis.text = element_text(size = 11, color = "black"),
    axis.title = element_text(size = 12, color = "black"),
    legend.text = element_text(size = 11, color = "black"),
    legend.title = element_text(size = 12, color = "black"),
    legend.position = "bottom",
    legend.direction = "horizontal"
  )

As you can see, the vast majority of analyzed samples (97 in 100) come with a rather high mineralization, leading to either cationsum or anionsum being > 2 meq/l. Only 28 out of 100 samples end up having an acceptable charge balance. Reasons for this can be manifold and range from bad sampling over laboratory issues to the presence of organic contaminants that might skew your charge balance. Either way, if your dataset has a charge balance rate that looks like the one in the histogram above, you should probably talk to your sampler.

Sören Wilke
Sören Wilke
Dr. rer. nat. Earth Science

My research interests include hydrogeochemistry, geologic modelling and geostatistics.