Molecular self-assembly: Quantifying the balance between intermolecular attraction and repulsion from distance and length distributions

Molecular self-assembly on surfaces constitutes a powerful method for creating tailor-made surface structures with dedicated functionalities. Varying the intermolecular interactions allows for tuning the resulting molecular structures in a rational fashion. So far, however, the discussion of the involved intermolecular interactions is often limited to attractive forces only. In real systems, the intermolecular interaction can be composed of both attractive and repulsive forces. Adjusting the balance between these interactions provides a promising strategy for extending the structural variety in molecular self-assembly on surfaces. This strategy, however, relies on a method to quantify the involved interactions. Here, we investigate a molecular model system of 3-hydroxybenzoic acid molecules on calcite (10.4) in ultrahigh vacuum. This system offers both anisotropic short-range attraction and long-range repulsion between the molecules, resulting in the self-assembly of molecular stripes. We analyze the stripe-to-stripe distance distribution and the stripe length distribution and compare these distributions with analytical expressions from an anisotropic Ising model with additional repulsive interaction. We show that this approach allows us to extract quantitative information about the strength of the attractive and repulsive interactions. Our work demonstrates how the detailed analysis of the self-assembled structures can be used to obtain quantitative insight into the molecule-molecule interactions.


Introduction
Molecular self-assembly has attracted great attention due to the impressive structural and functional variability that can be achieved with this versatile bottom-up method for supramolecular material synthesis. 1 A clever design of the molecular building blocks allows controlling the resulting structures and tailoring them to the specific needs of a given application. 23][14][15][16] The vast majority of these studies has focused on attractive molecule-molecule interactions such as hydrogen bonds, van der Waals forces, π-π interactions or electrostatics. 179][20][21][22][23][24] In the latter examples, the electrostatic repulsion between permanent as well as adsorption-induced electrical dipoles has been discussed as a promising way to enhance the structural complexity in molecular self-assembly on surfaces. Intermolecular repulsion gives rise to the formation of homogeneously dispersed individual molecules 18,19 , extended rows with well-defined row-to-row distances 9,22 as well as islands 24 and clusters 25 with well-defined sizes.
So far, however, the interplay between attractive and repulsive interactions in molecular structure formation has barely been explored as a powerful strategy to control both the shape and the size of self-assembled molecular structures on surfaces. 24 For systematically exploring the balance between molecular attraction and repulsion in molecular self-assembly, it is essential to quantify the involved interactions.
Here, we present a molecular model system of adsorbed 3-hydroxybenzoic acid molecules on a calcite (10.4) surface that provides both anisotropic attraction and repulsion. For this system, the molecular self-assembly has been shown to be governed by the balance between short-range intermolecular attraction and long-range intermolecular repulsion. 22,23 This balance results in the formation of molecular stripes with a coverage-dependent stripe-to-stripe distance distribution. 22 In order to determine the strength of the involved attractive and repulsive interactions, we consider an anisotropic Ising model with additional long-range dipole-dipole interaction. This model is generally applicable to stripe formation induced by intermolecular interaction. Based on a mean-field treatment, we derive analytical expressions for stripe-to-stripe distance and stripe length distributions.
The theory is compared with experimental data obtained from atomic force microscopy images. An analysis of these images yields coverage-dependent stripe-to-stripe distance distributions as well as stripe length distributions. By fitting the theoretical predictions to the distance and length distributions, we extract the strength of the attractive and repulsive molecule-molecule interactions. Our work constitutes an example of how mesoscopic structural information can be used for gaining quantitative molecular-level insights into the driving forces at play.

Methods
All dynamic atomic force microscopy (AFM) images shown in this work were acquired with a variable-temperature atomic force microscope (VT-AFM XA from ScientaOmicron, Germany) operating under ultrahigh vacuum conditions (p < 10⁻¹¹ mbar). We used silicon cantilevers purchased from NanoWorld (Neuchâtel, Switzerland) with an eigenfrequency of around 300 kHz (type PPP-NCH). To remove contaminants and a possible oxide layer, the cantilevers were sputtered with Ar⁺ at 2 keV for 10 minutes prior to use. The calcite crystals (Korth Kristalle GmbH, Germany) were prepared ex situ by mild ultrasonication in acetone and isopropanol for 15 min each. Inside the chamber, the crystals were degassed at about 580 K for 2 h. After this degassing step, the crystals were cleaved and annealed at about 540 K for 1 h. The quality of the crystal surface was then checked by collecting an image of typically 100 nm² size. The 3-hydroxybenzoic acid (3-HBA) molecules (99 % purity) were purchased from Sigma-Aldrich and used after degassing for 10 min at a temperature higher than 320 K. A home-built Knudsen cell with a glass crucible was used for sublimation. For the crucible used here, a temperature of 309 K resulted in a flux of approximately 0.01 monolayers per minute (ML/min). During sublimation, the partial pressure in the chamber was in the range of 1 × 10⁻¹² mbar for 3-HBA (m/z = 137 u/e) as measured with a mass spectrometer from MKS (e-Vision 2). For molecule deposition, the calcite sample was cooled to a temperature below 220 K.
The AFM measurements were performed at a sample temperature of 290 K. 1 This temperature is chosen such that the dynamics are fast enough to ensure thermodynamic equilibrium but slow enough to minimize effects on the statistical analysis. The images were acquired with a pixel resolution of 4000 × 4000 Px and a speed of 0.32 ms/Px, resulting in a measurement time of roughly 3 h/image. The image size was 1500 × 1500 nm², yielding a resolution of 0.375 × 0.375 nm²/Px.
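The quoted pixel size and acquisition time can be checked with a few lines of arithmetic. The following is a minimal sketch; the factor of two for forward and backward scanning of each line is an assumption that is not stated in the text, introduced here only because it makes the per-pixel time consistent with the quoted duration of roughly 3 h.

```python
# Scan parameters quoted in the text.
pixels_per_line = 4000
lines_per_image = 4000
time_per_pixel_s = 0.32e-3
image_size_nm = 1500.0

# Assumption (not stated in the text): every line is scanned forward and backward.
passes_per_line = 2

pixel_size_nm = image_size_nm / pixels_per_line
single_pass_h = pixels_per_line * lines_per_image * time_per_pixel_s / 3600.0
total_time_h = passes_per_line * single_pass_h

print(f"pixel size: {pixel_size_nm:.3f} nm")
print(f"time per image: {total_time_h:.2f} h")
```

Without the assumed second pass, the same numbers give about half the quoted measurement time.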
We present measurement series for three different coverages, with multiple images measured at the same location. The number of images per coverage in each series differs since we had to discard some of the images due to experimental difficulties. The remaining 14 images yielded a total of 1758254 stripe-to-stripe distances d and 17015 stripe lengths l.
To obtain the stripe-to-stripe distance and the length distributions from the AFM images, we proceeded as follows. After a plane subtraction and line-by-line correction, 26 the images were calibrated and corrected for linear drift. 27 Each image was segmented using a trainable machine learning tool. 28 Afterwards, neighboring pixels were connected and the connected structures fitted with a rectangle. 29,30 All relevant data of the fit rectangles (centroid position, length l and orientation) were collected and reconstructed as line segments for further analysis using the package SpatStat within the software R. 31,32 We discard stripes shorter than 5 nm since these are difficult to distinguish from wrongly fitted structures. For simplicity, we do not exclude stripes limited by image edges. We define the stripe-to-stripe distance as the distance between each 3-HBA dimer and its next neighbor in [010] direction. Thus, we get one distance per molecular dimer but only one length per stripe, which implies that the number of measured stripe distances is much larger than the number of stripe lengths.
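The counting rule of one distance per dimer but one length per stripe can be illustrated with a small sketch. This is not the SpatStat-based analysis used in this work. The function `stripe_distances` and its input format are hypothetical, stripes are idealized as segments exactly parallel to each other, and the neighbor is searched in one direction only, which is an assumption about the definition above.

```python
from bisect import bisect_right

def stripe_distances(stripes, dimer_spacing=0.8):
    """Return one stripe-to-stripe distance per dimer (in nm).

    stripes: list of (y, x_start, x_end) in nm, with x along the stripe
    direction and y perpendicular to it (hypothetical input format).
    For each dimer, the closest stripe at larger y that covers the
    dimer's x position is taken as its neighbor.
    """
    ordered = sorted(stripes)
    ys = [s[0] for s in ordered]
    distances = []
    for y, x0, x1 in ordered:
        # Dimers sit every dimer_spacing along the stripe.
        n_dimers = int((x1 - x0) / dimer_spacing) + 1
        first_above = bisect_right(ys, y)
        for k in range(n_dimers):
            x = x0 + k * dimer_spacing
            for y2, a, b in ordered[first_above:]:
                if a <= x <= b:
                    distances.append(y2 - y)
                    break  # only the nearest covering stripe counts
    return distances

# Three made-up stripes: the lower two overlap fully, the upper one partly.
example = [(0.0, 0.0, 4.1), (10.0, 0.0, 4.1), (25.0, 2.0, 10.0)]
print(stripe_distances(example))
```

Dimers without a covering stripe in the search direction contribute no distance, so long stripes dominate the distance statistics while each contributes only a single length.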

Experimental Results
When depositing 3-HBA molecules onto the (10.4) surface of calcite kept in ultrahigh vacuum, the molecules self-assemble into double-rows, as has been reported previously. 22 The molecular double-rows can be identified in AFM images as stripes oriented along the [421] direction of the calcite crystal, see Figure 1. Two molecules, one out of each row, form the stripe basis with a periodicity of 0.8 nm. 22 We call this basis a 3-HBA dimer. Each image in Figure 1 is a representative example from one of three series I-III of measurements at a given coverage, where $\theta_\mathrm{I}$ = 0.08 ML [Figure 1(a)], $\theta_\mathrm{II}$ = 0.11 ML [Figure 1(b)] and $\theta_\mathrm{III}$ = 0.16 ML [Figure 1(c)].
In Figure 2, the difference between the image shown in Figure 1(b) and an image taken six hours before is shown. The areas marked in blue (red) are regions of disappearing (appearing) molecules over time. From this difference image, it becomes evident that the molecules are mobile at a sample temperature of 290 K. More specifically, a total of ≈ 30% of the molecules, including entire stripes, change position within the measurement time of six hours, while about 70% of the structures do not change. We can thus expect that the statistics of a single image are not strongly affected by the long measurement time of roughly 3 h for a single image.
From the last and the first image of series III with a time difference of 18 hours, we have generated the stripe-to-stripe distance distributions shown in Figure 3(a), using a bin size of 0.5 nm. Comparing these two distributions reveals no significant difference. Both distributions exhibit a distinct maximum at a distance of 10 to 12 nm, implying that the stripes are not randomly placed on the surface. A random placement would result in a geometric distribution. 22 In addition, we have determined the stripe length distributions for the two images, which are shown in Figure 3(b) for a bin size of 4 nm. Again, the comparison of the two length distributions yields no significant difference.
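The statement that random placement leads to a geometric distance distribution, which decreases monotonically and therefore has no maximum at finite distance, can be checked numerically. The sketch below assumes a one-dimensional lattice whose sites are occupied independently with probability `theta`; the value 0.1 and the function name `gap_histogram` are illustrative choices, not taken from the experiments.

```python
import random

def gap_histogram(theta, n_sites, seed=0):
    """Count distances between consecutive occupied sites on a random 1D lattice."""
    rng = random.Random(seed)
    occupied = [i for i in range(n_sites) if rng.random() < theta]
    counts = {}
    for left, right in zip(occupied, occupied[1:]):
        gap = right - left
        counts[gap] = counts.get(gap, 0) + 1
    return counts

theta = 0.1  # illustrative occupation probability
counts = gap_histogram(theta, n_sites=200_000)
total = sum(counts.values())

# Compare the simulated frequencies with theta * (1 - theta)**(d - 1).
for d in range(1, 6):
    empirical = counts.get(d, 0) / total
    geometric = theta * (1 - theta) ** (d - 1)
    print(d, round(empirical, 4), round(geometric, 4))
```

A measured distribution with a pronounced maximum at 10 to 12 nm is therefore incompatible with uncorrelated stripe positions.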
To conclude, during 18 hours of measurement time appreciable rearrangements of the molecules occur, but the stripe-to-stripe and the length distributions do not change. Hence, the stripe patterns can be regarded as reflecting equilibrium structures. This justifies analyzing all images of each measurement series together to improve the statistics.
In Figure 4(a) we show the stripe-to-stripe distance distributions obtained from all images in each series (four images for series I and II, and six images for series III). These distributions are coverage-dependent, 22 exhibiting a decrease of mean distance ($\bar{d}_\mathrm{I}$ = 24.1 nm, $\bar{d}_\mathrm{II}$ = 18.2 nm and $\bar{d}_\mathrm{III}$ = 12.2 nm), standard deviation ($\sigma_\mathrm{I}$ = 12.2 nm, $\sigma_\mathrm{II}$ = 8.5 nm and $\sigma_\mathrm{III}$ = 4.0 nm) and position of the maximum with increasing coverage. The corresponding length distributions are shown in Figure 4(b). As explained above, the number of counts in each bin is much smaller than for the stripe distances. Overall, the length distributions decrease monotonically for large l. For the two higher coverages (series II and III), local maxima in the range l ≈ 50-100 nm appear. A corresponding maximum, however, is not clearly detectable at the lowest coverage. Rather than showing a monotonic trend of increasing mean length and standard deviation with increasing coverage, both values are smaller for series II compared to series I.

Theoretical modeling
For the equilibrated system of 3-HBA molecules on calcite, it has been proposed that repulsive interactions are caused by a charge transfer between surface and molecules, leading to dipole moments perpendicular to the surface. 22 As the stripes are formed by dimers, it is convenient to consider these as molecular units occupying lattice sites. We refer to them as "particles". The lattice sites correspond to the anchoring positions on the calcite surface. The analysis of AFM images shows that the stripes have a width of 2 nm and a periodicity of 0.8 nm. 22 This can be represented by a rectangular lattice with spacings $a_\parallel = 0.8\,a_0$ in stripe direction and $a_\perp = 2\,a_0$ perpendicular to it, where $a_0 = 1$ nm sets our length unit.
The interplay between attractive and dipolar interactions is described by the lattice gas Hamiltonian

$$H = -J \sum_{\langle i,j \rangle} n_i n_j + \Gamma \sum_{k<l} \frac{n_k n_l}{r_{kl}^3}.$$

Here $n_i$ are occupation numbers, i.e., $n_i = 1$ if the site $i$ is occupied by a particle and zero otherwise. The sum over $i$ and $j$ is restricted to nearest-neighbor (NN) sites in stripe direction corresponding to an anisotropic Ising model, and $r_{kl}$ is the (dimensionless) distance between sites $k$ and $l$. The interaction parameter $J > 0$ quantifies the strength of the attractive nearest-neighbor interaction. The strength of the repulsive dipole interaction is given by

$$\Gamma = \frac{p^2}{4\pi\epsilon_0 a_0^3},$$

where $p$ is the dipole moment of one dimer and $\epsilon_0$ is the vacuum permittivity.
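To make the two contributions to the Hamiltonian concrete, the following sketch evaluates the energy of a small configuration by direct summation. The function `lattice_gas_energy` and its argument format are hypothetical. It assumes that every pair of particles is counted once in the dipolar sum and that distances are measured in units of $a_0$ = 1 nm on the rectangular lattice described above.

```python
import math

def lattice_gas_energy(occupied, J, Gamma, a_par=0.8, a_perp=2.0):
    """Energy of a configuration given as a set of (i, j) site indices.

    i counts sites along the stripe direction, j perpendicular to it.
    Lattice spacings are in units of a0 = 1 nm; J and Gamma share one
    energy unit. Each pair of particles is counted once (assumption).
    """
    sites = sorted(occupied)
    energy = 0.0
    for idx, (i1, j1) in enumerate(sites):
        for i2, j2 in sites[idx + 1:]:
            # Attractive bond between nearest neighbors along the stripe.
            if j1 == j2 and abs(i1 - i2) == 1:
                energy -= J
            # Dipolar repulsion between every pair, decaying as 1/r^3.
            r = math.hypot((i1 - i2) * a_par, (j1 - j2) * a_perp)
            energy += Gamma / r**3
    return energy

# Illustrative parameter values only.
print(lattice_gas_energy({(0, 0), (1, 0)}, J=0.3, Gamma=0.02))
print(lattice_gas_energy({(0, 0), (0, 1)}, J=0.3, Gamma=0.02))
```

The first configuration is a two-particle stripe, which gains the bond energy J, whereas the second places the two particles in neighboring rows, where only the repulsion acts.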
In the following two subsections, we discuss analytical approaches to get insight into equilibrated stripe patterns for Γ = 0, and for Γ > 0 based on approximate one-dimensional treatments. This allows one to determine the interaction parameters J and Γ by fitting analytical expressions to match experimentally observed stripe distance and length distributions. For convenient notation in the following theoretical treatment, the stripe length l is given in units of $a_\parallel$ and the stripe distance d in units of $a_\perp$.

Stripe formation for Γ = 0
In the absence of dipolar interactions, the stripe positions in perpendicular direction are uncorrelated.
As a consequence, the stripe distance distribution is geometric, $\Phi_0(d) = \theta(1-\theta)^{d-1}$. For deriving the stripe length distribution, we can focus on a one-dimensional row of stripes. A stripe of length $l$ corresponds to an occupation number sequence 01...10, i.e., a configuration of two zeros separated by $l$ ones. We denote the probability of such a sequence by $q_l$. Knowing $q_l$, the stripe length distribution is $\Psi_0(l) = q_l / \sum_{l'=1}^{\infty} q_{l'}$. To determine $q_l$, we introduce the conditional probabilities $w(n_{i+1}|n_i, n_{i-1}, \ldots, n_1)$ of finding occupation number $n_{i+1}$ if the occupation numbers $n_i, n_{i-1}, \ldots, n_1$ are given. In the grand-canonical ensemble these satisfy the Markov property $w(n_{i+1}|n_i, n_{i-1}, \ldots, n_1) = w(n_{i+1}|n_i)$. 33 Accordingly, $q_l = (1-\theta)\,w(1|0)\,w(1|1)^{l-1}\,w(0|1)$, where the factor $(1-\theta)$ accounts for the first zero in the sequence, and the product of $w(\cdot|\cdot)$ is the Markov chain corresponding to the occupation numbers in the sequence. The conditional probability $w(1|1)$ is given by $w(1|1) = \chi_2(1,1)/\chi_1(1)$ with $\chi_1(1) = \theta$, and the joint probability $\chi_2(1,1)$ is equal to the equilibrium nearest-neighbor correlator 34

$$C(J) = \theta - \frac{2\theta(1-\theta)}{1 + \sqrt{1 + 4\theta(1-\theta)\left(e^{J/k_\mathrm{B}T} - 1\right)}}. \qquad (3)$$

Hence, the $l$ dependence of $q_l$ is $\propto (C/\theta)^l$, and for the length distribution we obtain

$$\Psi_0(l) = \left(1 - \frac{C}{\theta}\right)\left(\frac{C}{\theta}\right)^{l-1}, \qquad (4)$$

in agreement with results earlier reported in Ref. 35. For $J \to 0$, $C(0) = \theta^2$ and we obtain the geometric distribution $\Psi_0(l) = (1-\theta)\theta^{l-1}$.
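The length distribution for Γ = 0 can be evaluated numerically as a check of the limiting cases. The sketch below uses the closed form of the nearest-neighbor correlator that follows from the exact pair relation of a one-dimensional lattice gas with nearest-neighbor attraction, $C(1-2\theta+C)/(\theta-C)^2 = e^{J/k_\mathrm{B}T}$. The function names and the parameter values are illustrative assumptions.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def nn_correlator(theta, J, T):
    """Nearest-neighbor correlator <n_i n_{i+1}> of a 1D lattice gas (J in eV)."""
    eta = math.expm1(J / (K_B_EV * T))  # exp(J/kT) - 1
    root = math.sqrt(1.0 + 4.0 * eta * theta * (1.0 - theta))
    return theta - 2.0 * theta * (1.0 - theta) / (1.0 + root)

def length_distribution(l, theta, J, T):
    """Geometric stripe length distribution for vanishing dipolar interaction."""
    w = nn_correlator(theta, J, T) / theta  # conditional probability w(1|1)
    return (1.0 - w) * w ** (l - 1)

theta, J, T = 0.1, 0.1, 290.0  # illustrative values
C = nn_correlator(theta, J, T)
print("C/theta =", C / theta)
print("sum over l =", sum(length_distribution(l, theta, J, T) for l in range(1, 500)))
```

For J = 0 the correlator reduces to $\theta^2$, and a larger J pushes $C/\theta$ towards one, which corresponds to a slower decay and hence longer stripes.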

Stripe formation for Γ > 0
The dipolar interaction for Γ > 0 leads to repulsion between pairs of particles belonging to the same stripe as well as to different stripes. It tends to shorten the stripes and to increase the stripe distances. Compared to the case Γ = 0, the stripe distance distribution is more strongly affected than the length distribution, because the latter is largely determined by the attractive nearest-neighbor interaction J (if Γ < J).
In fact, one can expect that the length distribution for large l is still geometric as in Eq. (4) for Γ = 0. This is because for each Γ > 0 there is a characteristic length scale of induced correlations by the dipolar interaction. Considering long stripes to be composed of particle blocks with this length scale, the reasoning in the previous subsection leading to Eq. (4) is applicable with a renormalized $C = C_\mathrm{eff}(J, \Gamma)$ in Eq. (3). Hence, the length distribution in the presence of dipolar interactions is expected to decay exponentially for large l and to show deviations from the geometric shape at small stripe lengths.
Exact analytical solutions for the distance and length distributions are not available in the presence of competing attractive nearest-neighbor and dipolar interactions. We therefore rely on approximate treatments here.
As for the stripe lengths, it is instructive to first analyze whether a single isolated stripe can have an energetic minimum at a finite length. When increasing the length of this single stripe from $l$ to $l+1$, the energy changes by

$$\Delta H(l) = -J + \Gamma \sum_{k=1}^{l} \frac{1}{k^3}. \qquad (5)$$

For a minimum to occur, $\Delta H(l)$ must be negative for $l = 1$ and positive for $l \to \infty$. This implies $1 < J/\Gamma < \zeta(3) \cong 1.202$, where $\zeta(3)$ is the value of the Riemann zeta function at 3 (Apéry's constant). Accordingly, a finite single stripe can form only in a narrow regime of the interaction parameters J and Γ. However, in a system of many interacting stripes at a given coverage, the stripes can mutually stabilize each other at finite lengths for a wide range of J and Γ.
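The narrow window for a finite isolated stripe can be explored with a short calculation. The sketch assumes the form $\Delta H(l) = -J + \Gamma \sum_{k=1}^{l} k^{-3}$, which is the form consistent with the stated condition $1 < J/\Gamma < \zeta(3)$. The helper `optimal_length` is a hypothetical name, and the parameter ratios are illustrative.

```python
ZETA3 = 1.2020569031595943  # Apery's constant

def delta_H(l, J, Gamma):
    """Energy change when a single stripe grows from l to l + 1 particles."""
    return -J + Gamma * sum(1.0 / k**3 for k in range(1, l + 1))

def optimal_length(J, Gamma, l_max=100_000):
    """Smallest l at which further growth costs energy, or None if growth
    stays favorable up to l_max (no finite minimum)."""
    partial_sum = 0.0
    for l in range(1, l_max + 1):
        partial_sum += 1.0 / l**3
        if -J + Gamma * partial_sum > 0.0:
            return l
    return None

for ratio in (0.9, 1.1, 1.19, 1.3):
    print(ratio, optimal_length(J=ratio, Gamma=1.0))

# The sum converges quickly: after ten terms it is already close to zeta(3).
print(delta_H(10, J=0.0, Gamma=1.0), ZETA3)
```

Ratios below one leave single particles isolated, ratios above $\zeta(3)$ give unlimited growth, and only the interval in between yields a finite energy-minimizing length.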
Due to the fast convergence of the sum in Eq. (5), the energy change $\Delta H(l)$ for attaching one further particle to a stripe becomes essentially constant for $l \gtrsim 10$. We thus can expect Eq. (4) to hold for large $l$ with $C_\mathrm{eff}(J, \Gamma) = C(J_\mathrm{eff})$, where

$$J_\mathrm{eff} = J - \zeta(3)\,\Gamma. \qquad (6)$$

The corresponding approximate stripe length distribution is referred to as $\tilde\Psi(l)$.
We expect this distribution to have the same asymptotic behavior as the true length distribution $\Psi(l)$, i.e.

$$\Psi(l) \sim \tilde\Psi(l) \propto \left[\frac{C(J_\mathrm{eff})}{\theta}\right]^{l}, \quad l \to \infty. \qquad (7)$$

Deviations from $\tilde\Psi(l)$ are expected to be significant for small $l$. If the effective nearest-neighbor interaction $J_\mathrm{eff}$ is attractive, i.e., $J > \zeta(3)\Gamma$, the energy change $\Delta H(l)$ in Eq. (5) is negative, implying that single particles or small stripes are energetically unfavorable compared to longer stripes. Accordingly, we expect $\Psi(l)$ to be smaller than $\tilde\Psi(l)$ for small $l$.
As for the stripe distance distribution $\Phi(d)$, we can assume that it is governed by the dipolar interaction between neighboring stripes in perpendicular direction. Applying a mean-field approach similar to that introduced in Ref. 22, we divide the two-dimensional stripe pattern into mutually independent one-dimensional parallel bands in perpendicular direction. The bands are considered to have the same width $\bar{l}$, where $\bar{l}$ is the mean stripe length.
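For each stripe appearing in a band, we consider it to span the whole band, i.e. to have length $l = \bar{l}$. In one band, the interaction $U(d)$ between two stripes at distance $d$ with dipole density $p/a_\parallel$ is (integrating along both stripes with parametrization $s_1$ and $s_2$)

$$U(d) = \frac{p^2}{4\pi\epsilon_0 a_\parallel^2} \int_0^{\bar{l} a_\parallel} ds_1 \int_0^{\bar{l} a_\parallel} ds_2\, \frac{1}{\left[(d\,a_\perp)^2 + (s_1 - s_2)^2\right]^{3/2}}. \qquad (8)$$

Hence, we have mapped each band onto a one-dimensional lattice occupied by particles with interaction $U(d)$ between neighboring stripes. The mean occupation of lattice sites is fixed by the coverage θ. In the presence of the purely repulsive $U(d)$, it can be viewed as resulting from a confinement pressure $f$, which prevents the particles from becoming infinitely separated and gives rise to a mean distance $\bar{d}$. Our approximation $\tilde\Phi(d)$ of the stripe distance distribution thus is given by

$$\tilde\Phi(d) = \frac{1}{Z} \exp\left(-\frac{U(d) + f d}{k_\mathrm{B}T}\right), \qquad (9a)$$

where $Z = \sum_{d=1}^{\infty} \exp\{-[U(d) + f d]/k_\mathrm{B}T\}$ and $f$ is fixed by the condition

$$\sum_{d=1}^{\infty} d\,\tilde\Phi(d) = \bar{d}. \qquad (9b)$$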
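The band picture lends itself to a direct numerical evaluation. The following sketch is one possible reading of this approach and not the implementation used for the fits. It assumes a Boltzmann form $\exp\{-[U(d) + f d]/k_\mathrm{B}T\}$ for the distance distribution, uses the closed form of the double integral over two parallel line segments, $2(\sqrt{L^2 + D^2} - D)/D^2$ with $L = \bar{l} a_\parallel$ and $D = d\,a_\perp$, truncates the sum at `d_max`, and determines the confinement pressure by bisection. All function names and parameter values are illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def stripe_pair_energy(d, Gamma, mean_len, a_par=0.8, a_perp=2.0):
    """U(d) in eV; d in units of a_perp, mean_len in units of a_par,
    lattice spacings in units of a0 = 1 nm, Gamma in eV."""
    L = mean_len * a_par
    D = d * a_perp
    return Gamma / a_par**2 * 2.0 * (math.hypot(L, D) - D) / D**2

def distance_distribution(Gamma, mean_len, mean_dist, T=290.0, d_max=2000):
    """Return (f, phi), where phi[d - 1] is the probability of distance d
    and f (eV per lattice spacing) reproduces the requested mean distance."""
    kT = K_B_EV * T
    ds = range(1, d_max + 1)
    U = [stripe_pair_energy(d, Gamma, mean_len) for d in ds]

    def normalized(f):
        w = [math.exp(-(u + f * d) / kT) for u, d in zip(U, ds)]
        Z = sum(w)
        return [x / Z for x in w]

    def mean(f):
        return sum(d * p for d, p in zip(ds, normalized(f)))

    # The mean distance decreases monotonically with f, so bisection applies.
    f_lo, f_hi = 1e-6 * kT, 10.0 * kT
    for _ in range(100):
        f_mid = 0.5 * (f_lo + f_hi)
        if mean(f_mid) > mean_dist:
            f_lo = f_mid
        else:
            f_hi = f_mid
    f = 0.5 * (f_lo + f_hi)
    return f, normalized(f)

f, phi = distance_distribution(Gamma=0.025, mean_len=100.0, mean_dist=9.0)
most_probable = 1 + max(range(len(phi)), key=phi.__getitem__)
print("f / kT =", f / (K_B_EV * 290.0), "most probable d =", most_probable)
```

In contrast to the geometric distribution for Γ = 0, the repulsion suppresses small distances, so the resulting distribution has a maximum at a finite distance.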

Application to experiments
The parameters J and Γ are estimated by fitting $\tilde\Phi(d)$ from Eq. (9a) to the distribution $\Phi(d)$, and by fitting Eq. (7) to the tail of $\Psi(l)$, where $\Phi(d)$ and $\Psi(l)$ are the distributions obtained in the experiments. We first determine Γ, and hence $p = \sqrt{4\pi\epsilon_0 a_0^3\,\Gamma}$, by fitting $\tilde\Phi(d)$ to $\Phi(d)$ with the experimental $\bar{l}$ in Eq. (8). We then extract $J_\mathrm{eff}$ by fitting the tail of $\Psi(l)$, which yields J via Eq. (6).
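The conversion between the fitted interaction strength Γ and the dipole moment p is a unit exercise that is easy to get wrong. A minimal sketch, assuming $\Gamma = p^2/(4\pi\epsilon_0 a_0^3)$ with $a_0$ = 1 nm, with Γ expressed in eV and p in Debye; the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
DEBYE = 3.33564e-30      # 1 Debye in C m
EV = 1.602176634e-19     # 1 eV in J
A0 = 1e-9                # length unit a0 = 1 nm in m

def gamma_from_dipole(p_debye):
    """Dipolar interaction strength Gamma in eV for a dipole moment in Debye."""
    p = p_debye * DEBYE
    return p**2 / (4.0 * math.pi * EPS0 * A0**3) / EV

def dipole_from_gamma(gamma_ev):
    """Dipole moment in Debye for an interaction strength Gamma in eV."""
    return math.sqrt(gamma_ev * EV * 4.0 * math.pi * EPS0 * A0**3) / DEBYE

gamma = gamma_from_dipole(6.3)
print(f"Gamma = {gamma * 1000:.1f} meV")
print(f"p = {dipole_from_gamma(gamma):.2f} D")
```

Because p scales with the square root of Γ, a relative uncertainty in Γ translates into half that relative uncertainty in p.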
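Figure 5 shows fits of $\tilde\Phi(d)$ (circles, connected by solid lines) to $\Phi(d)$ (histogram) for each series, using the method of least squares. The optimal values of Γ (and corresponding p) for each coverage are listed in Table 1. The distributions $\tilde\Phi(d)$ obtained with the mean dipole moment $\bar{p}$ = 6.3 D are shown as dashed black lines in Figure 5.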
The mean $\bar{p}$ differs by about 1 D from the optimal value for the smallest coverage. This raises the question of the sensitivity of the fitting with respect to p. We thus analyze how $\tilde\Phi(d)$ deviates from $\Phi(d)$ for even larger differences of p from its optimal value. For values $p_>$ = 8.3 D and $p_<$ = 4.3 D, larger and smaller by 2 D, $\tilde\Phi(d)$ is shown in the inset of Figure 5(a). As can be seen from this inset, deviations from $\tilde\Phi(d)$ for the optimal p value are now clearly visible. We thus conclude that the error in our estimate is about ±1 D.
Taking $\bar{p}$ as the dipole moment of the 3-HBA dimer yields a dipole moment $\bar{p}/2$ = 3.2 D for the single molecule, in fair agreement with our former estimate. 22 Having determined Γ, we now analyze the stripe length distribution to determine J. Figure 6 shows the measured length distributions $\Psi(l)$ for each series (circles). The distributions are determined by using bins of varying size with an approximately equal number of events in each bin. As expected, all distributions show an exponential decay for large l. The solid lines are fits to these exponential decays for l > 100 nm. According to Eq. (7), the decay length of $\Psi(l) \sim e^{-l/l_0}$ is

$$l_0 = -\frac{1}{\ln\left[C(J_\mathrm{eff})/\theta\right]}. \qquad (10)$$

The characteristic decay length $l_0$ for each experimental distribution thus yields a value $J_\mathrm{eff}$ via Eq. (10) in combination with Eq. (3). The interaction parameters J then follow from Eq. (6) and are listed in the fifth column of Table 1. These values lie around 0.29 eV. Our final estimate of the analysis is p = 6.3 ± 1 D and J = 0.29 ± 0.04 eV. [38]
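The step from a fitted decay length to the attraction strength can be written in closed form. The sketch below inverts the relation $C(J_\mathrm{eff})/\theta = e^{-1/l_0}$ using the exact pair relation of the one-dimensional lattice gas, $e^{J/k_\mathrm{B}T} = C(1-2\theta+C)/(\theta-C)^2$, and then adds $\zeta(3)\Gamma$. It assumes that the decay length is given in nm and converted with $a_\parallel$ = 0.8 nm, and that θ is the fraction of occupied lattice sites. Function names and the numbers in the example are illustrative and not taken from Table 1.

```python
import math

K_B_EV = 8.617333262e-5     # Boltzmann constant in eV/K
ZETA3 = 1.2020569031595943  # Apery's constant

def nn_correlator(theta, J, T):
    """Nearest-neighbor correlator of a 1D lattice gas (used for the round trip)."""
    eta = math.expm1(J / (K_B_EV * T))
    root = math.sqrt(1.0 + 4.0 * eta * theta * (1.0 - theta))
    return theta - 2.0 * theta * (1.0 - theta) / (1.0 + root)

def j_eff_from_decay_length(l0_nm, theta, T=290.0, a_par=0.8):
    """Effective attraction (eV) from the decay length of the length distribution."""
    w = math.exp(-a_par / l0_nm)  # equals C(J_eff) / theta
    C = w * theta
    ratio = C * (1.0 - 2.0 * theta + C) / (theta - C) ** 2
    return K_B_EV * T * math.log(ratio)

def j_from_j_eff(j_eff, Gamma):
    """Bare attraction J (eV) from J_eff and the dipolar strength Gamma (eV)."""
    return j_eff + ZETA3 * Gamma

# Round trip with made-up numbers: J_eff -> decay length -> J_eff.
theta, j_eff_true = 0.11, 0.25
l0 = -0.8 / math.log(nn_correlator(theta, j_eff_true, 290.0) / theta)
j_eff = j_eff_from_decay_length(l0, theta)
print(f"l0 = {l0:.1f} nm, J_eff = {j_eff:.3f} eV, J = {j_from_j_eff(j_eff, 0.025):.3f} eV")
```

Since the decay length enters only through a logarithm, moderate errors in the fitted tail lead to comparatively small errors in the interaction strength.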

Conclusions
In summary, we have presented an approach to estimate the strengths of short-range attractive and long-range repulsive interactions between 3-HBA molecules on a calcite surface by an analysis of stripe-to-stripe distance distributions Φ(d) and stripe length distributions Ψ(l).
Experimental distributions were determined from an analysis of three AFM image series with different coverages 0.08 ML, 0.11 ML, and 0.16 ML at a temperature of 290 K. The measurements of these series spanned time intervals of up to 18 hours. A comparison between distributions of individual images in the same series strongly suggests that the stripe patterns are in thermodynamic equilibrium.
The attractive interaction responsible for the stripe formation was considered to be an effective one with strength J between neighboring 3-HBA dimers, without resorting to details of the molecular structure. The long-range repulsive interaction is modeled as dipole-dipole interaction of characteristic strength Γ as previously proposed in Ref. 22. It is believed to be caused by a charge transfer between the surface and 3-HBA molecules. As these molecules have specific anchoring sites on the calcite surface, the system could be described by a lattice gas corresponding to an anisotropic Ising model with additional dipolar interaction.
Based on this model, we developed mean-field approaches to derive approximate expressions for the stripe distance and length distributions with J and Γ as parameters. Fitting these parameters to the experimental distributions, we obtained the estimates J = 0.29 ± 0.04 eV and p = 6.3 ± 1 D for the dipole moment $p \propto \sqrt{\Gamma}$ of a 3-HBA dimer.
The modeling approach presented here is applicable also to other molecular systems self-assembling into stripe patterns, if the stripe formation is dominated by short-range attractive molecule-molecule interactions. In general, one can expect additional long-range electrostatic interactions to be present. Their impact on the structure formation depends on their type (e.g., dipolar, quadrupolar) and strength, but the core of our methodology is independent of these features.
The mean-field treatment, however, requires the formation of structures with long stripes arranging into patterns with large overlaps between neighboring parallel stripes. This requirement is fulfilled only if the repulsive interaction is not too strong compared to the attractive one, and if the coverage is not too small. The coverage must not be too high either because otherwise the structure will no longer be composed of individual stripes. For determining the respective limits of our mean-field treatment, extensive simulations of the many-body problem are needed, which is left for future research.
As long as the aforementioned requirements are met, other types of interactions can be accounted for by minor adjustments of the mean-field approach. As for the stripe distance distribution, only the effective interaction potential $U(d)$ between stripes in Eq. (8) needs to be modified. As for the stripe length distribution, we expect a length scale to exist beyond which correlations within a stripe can be renormalized to an effective nearest-neighbor interaction between segments. The interplay between attractive and repulsive interaction in $\Psi(l)$ can then be accounted for by one effective coupling parameter analogous to $J_\mathrm{eff}$ in Eq. (6).
From a general point of view, it should be scrutinized whether a modeling with static dipole moment is appropriate. Our use of a static dipole moment here relies on the assumption of an approximately fixed amount of charge transferred between the surface and each molecule. The results in Table 1 indicate a decreasing dipole moment with increasing coverage. This can be interpreted by a dynamic dipole moment which becomes smaller in order to compensate for additional repulsive interactions with further molecules. A change of the molecule-surface interaction as a response to a repulsive interaction has been reported earlier in Refs. [39][40][41]. Dynamical dipole moments can be coped with in a theoretical treatment by introducing a molecular polarizability for the molecules. This leads to varying dipole moments in dependence of their local environment. How important these variations are is presently unknown. The uncertainties of the values in Table 1 and the rather narrow coverage range 0.08-0.16 ML do not allow us to give a firm assessment of how strong the effects of a dynamical dipole moment are. Additional investigations with a wider range of coverages are needed. Further experimental and theoretical research in this direction offers promising perspectives to gain deeper insight into the impact of the interplay between repulsive and attractive interactions on molecular self-assembly.

Figure 1: Representative atomic force microscopy (AFM) topography ($z_p$) images of 3-hydroxybenzoic acid (3-HBA) on calcite (10.4) from the three measured series I, II, and III at temperature 290 K and coverages (a) $\theta_\mathrm{I}$ = 0.08 ML, (b) $\theta_\mathrm{II}$ = 0.11 ML, and (c) $\theta_\mathrm{III}$ = 0.16 ML. All images are cutouts with a size of 1150 × 1150 nm² and a resolution of 3446 × 3446 Px. The fast (small arrow) and slow (large arrow) scan directions are given in the upper right corner. The surface directions are indicated by the arrows in the lower right corner.

Figure 2: Comparison of the image shown in Figure 1(b) with an image taken six hours before, demonstrating the redistribution of molecules. Areas where molecules disappear (appear) are marked in blue (red).

Figure 3: Comparison of histograms obtained from the first and last image of series III. The images have a time separation of 18 hours. In (a) the counts of stripe-to-stripe distances in bins of size 0.5 nm are shown, and in (b) the counts of stripe lengths in bins of size 4 nm. The bins give the histograms obtained from the first image, and the horizontal bars marked in blue indicate the corresponding counts from the last image.

Figure 4: Histograms of (a) stripe-to-stripe distances and (b) stripe lengths obtained from all images in the series I (0.08 ML, 4 images), II (0.11 ML, 4 images), and III (0.16 ML, 6 images) [color coding according to legend in (a)].The bin sizes are as in Figure 3.

Figure 5: Histograms of the measured distance distributions $\Phi(d)$ for the three different coverages (a) $\theta_\mathrm{I}$ = 0.08 ML, (b) $\theta_\mathrm{II}$ = 0.11 ML, and (c) $\theta_\mathrm{III}$ = 0.16 ML in comparison with the fitted theoretical distributions $\tilde\Phi(d)$ (circles, connected by solid lines). Dashed black lines correspond to $\tilde\Phi(d)$ with the mean dipole moment of $\bar{p}$ = 6.3 D. The inset in (a) shows the fitted $\tilde\Phi(d)$ for $\theta_\mathrm{I}$ (circles, connected by orange line) compared to $\tilde\Phi(d)$ for $p_>$ = 8.3 D and $p_<$ = 4.3 D (black lines).

Figure 6: Stripe length distributions Ψ(l) for the three different coverages (I-III, circles) with fits to the exponential tails for l > 100 nm (solid lines).

Table 1: Parameters obtained from fitting the theoretical model to the experimental stripe distance and length distributions for the three different coverages (series I-III).