Ziling Xu is a postgraduate student of the University of Science and Technology of China. Her research mainly focuses on random network
Qunqiang Feng is currently an Associate Professor at the University of Science and Technology of China (USTC). He received his Ph.D. degree from USTC in 2006. His research mainly focuses on applied probability, random network models, and network data analysis
Distances between nodes are one of the most essential subjects in the study of complex networks. In this paper, we investigate the asymptotic behaviors of two types of distances in a model of geographic attachment networks (GANs): the typical distance and the flooding time. By generating an auxiliary tree and using a continuous-time branching process, we demonstrate that in this model the typical distance is asymptotically normal, and the flooding time converges to a given constant in probability as well.
Graphical Abstract
Distances in a geographical attachment network model.
Abstract
Distances between nodes are one of the most essential subjects in the study of complex networks. In this paper, we investigate the asymptotic behaviors of two types of distances in a model of geographic attachment networks (GANs): the typical distance and the flooding time. By generating an auxiliary tree and using a continuous-time branching process, we demonstrate that in this model the typical distance is asymptotically normal, and the flooding time converges to a given constant in probability as well.
Public Summary
The asymptotic properties of the typical distance and the flooding time in a geographic attachment network (GAN) model are studied.
The typical distance of GAN is asymptotically normal.
The flooding time of GAN converges to a given constant in probability.
A significant amount of research has focused on networks as a result of the recent increase in interest in social networks[1], communication networks[2], scientific collaboration networks[3], biological networks[4], and many other types of networks. To analyze these networks, researchers have developed a number of models, many of which are widely used, such as the Watts and Strogatz model and the preference attachment model. Due to various constraints, different models have different topologies. In this paper, we concentrate on the model affected by geographical restrictions.
It is a given in social networks that people who relocate will most likely develop acquaintances with people in the area, such as their neighbors. Motivated by this idea, a geographical attachment network (GAN) model was first proposed in Ref. [5]. In this network model, its size (the number of nodes in it) increases over time, and the newly added nodes are only connected to the nodes that are closest to them.
The following guidelines can be used to generate the GAN model presented in this work. We start with an initial state (at time n=0) of three nodes distributed on a ring, all of which are connected to one another. That is, the initial graph GAN(0) is a triangle; see Fig. 1a. At time n≥1, the network GAN(n) is obtained from GAN(n−1) in the following manner: A new node is placed in an internode interval chosen uniformly at random from the n+2 existing nodes along the ring and connected to its two nearest neighbors (one on either side). This GAN model is the simplest case in Ref. [5]; see also Ref. [6]. For convenience in below, here we may define potential nodes and active intervals. If two endpoints of an interval are adjacent on the ring, the interval is said to be active. Each active interval corresponds to a potential node, which is a node that may be chosen as a new node in the future. Instead, we refer to nodes that are added to the network as actual nodes. There are two potential edges that could connect each potential node to its neighbors. Fig. 1 shows an illustration of this network at times n=0, 1, and 2. For the variants of GAN models, we refer to Refs. [7–9].
Figure
1.
Illustration of the growing GAN model with potential nodes for time n=0, 1, and 2, where points ∙ represent nodes in the network, points ∘ are potential nodes, and red dashed lines are potential edges.
A path in network G is a subnetwork with a sequence of successive edges that joins a sequence of distinct nodes (except, possibly, the first and last node). The length of a path is the number of edges in it. Despite the fact that the geographical distance along the ring is used in the generating process of the GAN model, in this paper, we consider the graphic distance, i.e., the distance between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes. The diameter of a network is the largest distance among all pairs of nodes.
Several properties of the GAN model are obtained in Ref. [5] with heuristic arguments and computational simulations, and Ref. [6] using the rigorous probabilistic method. Let Pk,n be the proportion of nodes with degree k≥2 in the network GAN(n). It is shown in Ref. [6] that as n→∞,
Pk,np→13(23)k−2,k=2,3,⋯,n.
Then the GAN model is not a scale-free network[10], in which the limiting degree distribution is a power law. This indicates that the network model considered here is essentially different from the random Apollonian network model (see, for example, Refs. [11–14]), which looks quite similar to the GAN model. For the diameter of the GAN model, the following result is also derived in Ref. [6]. We say that a sequence of events {En,n≥1} occurs with high probability (w.h.p.) when P(En)→1 as n→∞.
Theorem 1. As n→∞, the diameter of the network GAN(n) is with high probability asymptotic to 2clogn with the constant
c=x0−1x0(log2x201−x0)−1=1.6738050⋯,
where x0=0.2550568⋯ is the unique solution in the interval (0,1) to the equation
x−log2x21−x=2.
In this paper, we further investigate another two types of distances in the GAN model: the typical distance and flooding time. The typical distance of a network is the distance between a pair of nodes picked in it uniformly at random (u.a.r.). For the typical distance of other random graph models, we refer to Refs. [15–18]. The flooding time of a network is the greatest distance from a randomly chosen node to other nodes. For more backgrounds and results of flooding time, see Refs. [19–22].
The rest of the paper is organized as follows. In Section 2, we analyze the structure of the subnetworks and obtain some results by building an auxiliary tree and using a continuous-time branching process. Based on these, we derive the results of the typical distance and flooding time of GAN models in Section 3.
2.
Subnetworks and auxiliary tree
When the GAN model is in its initial state, i.e., GAN(0), we can see that the ring is divided into three initial intervals, I1,I2, and I3 (see Fig. 2). We refer to the subnetwork made up of the nodes and edges in interval Ii, including the endpoints of the interval, as GANi, where i=1,2,3. It goes without saying that any pair of nodes’ paths involves a maximum of two subnetworks. There are only two scenarios to take into account for the typical distance of GAN(n): (A) Two nodes originate from the same initial internal; (B) two nodes originate from distinct initial internals. Indeed, we only need to consider the typical distance of one of the subnetworks and the distance between any initial node and the node selected u.a.r. in the subnetwork. Therefore, we need to concentrate on the subnetworks first.
Assuming that there are Y1,n,Y2,n, and Y3,n nodes in each of the three open intervals at time n, then Y1,n+Y2,n+Y3,n=n. We obtain that
Yi,nnd→Beta(1,2),i=1,2,3,
as n→∞, where Beta(α,β) denotes the beta distribution with parameters α and β. With high probability, the random variables Y1,n,Y2,n, and Y3,n have the same order of order n, as proven in Ref. [6]. Then, we can use the fact that
logYi,n=logn+Op(1).
(1)
The identical distribution of the three random variables is clear to notice. Without loss of generality, we restrict our attention to the subnetwork GAN1(n). We suppose that Y1,n=n0, where n0≥0 is an integer, and that a new node is added at each step in this subnetwork. As a result, the size of the subnetwork GAN1(n) can be set to n0+2, where n0 varies with n. GAN1(n) has two initial nodes that are linked together, hence the distances between any given node in the network and the two initial nodes can differ by up to one.
If we consider the first node added to the subnetwork to be the root of a binary tree and each time a new node is added, the two potential nodes produced are the two children of that node, we can create a binary tree according to this relationship. Afterwards, we may utilize a binary tree to obtain some of this subnetwork’s characteristics. Moreover, in each step, only one potential node is transformed into an actual node, and two new potential nodes are generated. This network’s creation resembles a unique continuous-time branching process in which each individual splits just once to produce two offspring.
2.1
Auxiliary tree structure
In this section, we construct an auxiliary binary tree and embed a branching process in the subnetwork GAN1(n). There is only one initial interval and one potential node in the initial state of subnetwork GAN1. An interval becomes two new active intervals when a new node u is added to it, and these two new intervals also correspond to two new potential nodes. The potential node that lies to the left (or right) of node u is called the left (or right) child of u. Instead, u is the parent of these two nodes. The ancestral line of node u is a path that leads from the first nodes added in GAN1(n) to u, where the previous node is the parent of the next node. The network in Fig. 3a can be redrawn using the shape of the binary tree to obtain Fig. 3b. The auxiliary tree in this paper is constructed in a manner similar to that in Ref. [6], but the nodes in the auxiliary tree correspond to the edges of the network in the latter. As shown in Fig. 3b, we can embed a continuous-time branching process (CTBP) (see, for example Refs. [23, 24]) into GAN1(n).
Figure
3.
(a) is the subnetwork GAN1 after adding several nodes without marking potential nodes where the nodes labeled 0 and 1 are the initial nodes. By redrawing the network, we can obtain (b). The nodes marked as black circles and the nodes marked as blue squares are actual nodes and potential nodes, respectively. Except for the blue dotted line, the solid lines and the dotted lines represent the ancestral line of each node and shortcuts, respectively. Furthermore, the black lines and red lines represent existing edges and potential edges, respectively. u0 is one of the potential nodes of this subnetwork.
Consider a CTBP as follows: At the beginning, there is a single individual who serves as the process’s root. This initial individual then splits into two individuals (producing two offspring) before becoming inactive. Each individual has an i.i.d. exponential lifespan with a mean of 1. As a result, after birth, each individual is active for the entirety of its lifespan before splitting, going inactive, and giving birth to two offspring who then become active for their own i.i.d. Exp(1) lifespans. There are n0+1 active individuals if n0 individuals split during the process. Let ti represent the time when the ith individual splits.
There is a mapping relation between the subnetwork GAN1(n) and the CTBP with n0 splitters as follows. The first node in GAN1(n) (i.e., the first node added to the subnetwork GAN1(n)) corresponds to the root of the CTBP. When the n0th individual in the CTBP splits, the already split individuals are the actual nodes in GAN1(n), while the individuals who are still active in the CTBP are the potential nodes in GAN1(n). This is true because, in each step, two new potential nodes are created in place of the actual node when a new node is added to GAN1(n). Furthermore, a potential node is randomly chosen in each step, which is equivalent to the CTBP’s next individual splitting being a randomly chosen active individual due to the memoryless nature of exponential variables. In fact, there is a maximum difference of one between any node’s distance from the first node and its distance from either of the initial nodes.
The generation of an individual in the CTBP is equivalent to the length of the ancestral line leading to the corresponding node in GAN1(n). For active individuals, we can directly apply the conclusion of Corollary 1.1 in Ref. [24]. Let Lu be the generation of the individual in the CTBP corresponding to potential node u selected u.a.r. in GAN1(n). Then as n0→∞,
Lu−2logn0√2logn0d→N(0,1).
(2)
2.2
Shortcuts
Obviously, for each node, there is a shortcut that connects it to other ancestors in addition to the connection to its parent. To calculate the distance between each pair of potential nodes, it is necessary to identify the shortcuts between the first node and each node.
The ancestral line shows one of the paths from the node to the first node in GAN1(n), and we can use the ancestral line to identify the shortcuts. We could also define a sequence for each node’s ancestral line to discover the rule governing the existence of shortcuts. First, the left and right child nodes are denoted by symbols l and r, respectively. The ancestral line of each node provides a sequence of l and r, where the ith symbol of the sequence denotes whether the (i+1)th ancestor is the left or right child of the ith ancestor. For instance, the sequence of the node labeled u0 in Fig. 3b is lrlrr. The sequence of the first node is represented by ∅. Each node’s sequence is therefore unique, with the exception of the initial nodes. The sequence is sometimes used to represent the node. Furthermore, instead of trying to distinguish between the two initial nodes, we use the symbol Λ to represent all initial nodes.
Assuming there are two nodes u and v, we can represent their sequences using the formulas u=u1...up and v=v1...vq, respectively, where ui,vj∈{l,r}, i=1,...,p,j=1,...,q, and p and q are the lengths of the sequence, i.e., |u|=p and |v|=q. We use ηi to represent the last position of i∈{l,r} in a sequence and define a truncation operator T,
Tiu:={u1⋯uηi−1,if symboliappears inthe sequence of the node;Λ,if symbolidoes not appear inthe sequence of the node.
The parent of u is obviously either Tlu or Tru. In particular, the first node has two parents, both of which are initial nodes. Along the ancestral line of u, the next node of Tlu is (Tlu)l, so u is the left offspring of Tlu, i.e., Tlu is to the right of u. Similarly, Tru is to the left of u.
Based on the definition of the sequence of nodes and the above assumptions, we obtain the following facts:
(a) The prefixes of the sequence of node u refer to the ancestors of u, i.e., the nodes with the sequence u1...uk, k=1,...,p−1, are ancestors of u.
(b) The sequence u∧v:={u1...uk:ui=vi,i=1,...,k;uk+1≠vk+1;k≤min represents the latest common ancestor of u and v , which can be written as u\land v .
(c) The shortest path between u and v must pass through some of their common ancestors. We must ascend to their common ancestors to determine the shortest path between them.
Claim 1.T_r{\boldsymbol{u}} and T_l{\boldsymbol{u}} are two ancestors of {\boldsymbol{u}} connected to {\boldsymbol{u}}, i.e., the neighbors of {\boldsymbol{u}} when node {\boldsymbol{u}} is added to the network as a new node.
Proof. We prove this claim by induction. The sequence of the first node is \emptyset , and its children are l and r . Then the result is valid for the first node. Assume that the claim is valid for the most recently added node {\boldsymbol{u}} , that is, {\boldsymbol{u}} links to T_r{\boldsymbol{u}} and T_l{\boldsymbol{u}} . The endpoints of the interval to the left of u are u and T_r{\boldsymbol{u}} . When the left child of {\boldsymbol{u}} , {\boldsymbol{u}}l , is added to the network, T_l({\boldsymbol{u}}l) = {\boldsymbol{u}} and T_r({\boldsymbol{u}}l) = T_r{\boldsymbol{u}} . These two results are exactly the endpoints of the interval of the left child of u . Therefore, the claim holds for {\boldsymbol{u}}l . Similarly, the claim also holds for {\boldsymbol{u}}r .
As a result, the shortcut connects any given node u to either T_r{\boldsymbol{u}} or T_l{\boldsymbol{u}} . For any node u , observe the difference between the sequence of nodes connected by the shortcut and the sequence of u . From back to front, this difference collects exactly two symbols. For example, the shortcut of node u_0 with the sequence lrlrr in Fig. 2 connects to the node with the sequence lr , and their difference is lrr . Consequently, we can partition the sequence. From the back to the front, when exactly two symbols have been collected completely, i.e. the occurrence of the sequence lr or rl , we divide them into a block and obtain a shortcut. The division is then repeated in the same way to obtain the number of blocks. The last block may not have a complete collection of two symbols, but this block represents a node that is 1 away from one of the initial nodes.
Reverse the sequence and define an event {\cal{E}} : the occurrence of the sequence lr or rl . Let Y be the number of symbols needed for event {\cal{E}} to occur for the first time in the reversed sequence. It is easy to calculate
When event {\cal{E}} occurs for the first time, we remove the symbols in the reverse sequence that precede it and those that represent it, and then we need to find the location of event {\cal{E}} in the remaining sequence. This operation is repeated until no event {\cal{E}} occurs in the remaining sequence. The number of times an event {\cal{E}} is repeated in the inverse sequence of a node is equal to the distance between this node and the initial node minus 1 . Suppose there is a sequence of length m and let S_m be the number of occurrences of event {\cal{E}} in that sequence. By (6.8) in Chapter XIII of Ref. [25],
Furthermore, we use L_u to represent the length of the ancestral line of node u . If node u is the ancestor of node v , then the shortest distance between u and v only needs to consider the difference in the two nodes’ sequences, and the length of the difference is L_u-L_v .
Lemma 1. Suppose that u is a potential node picked u.a.r. in {{\rm{GAN}}}_1(n) with size n_0+2 and let S_{L_u} be the shortest distance from u to the first node. If n_0\rightarrow\infty , S_{L_u} satisfies
According to formula (2), \dfrac{L_u}{2\log n_0}\xrightarrow{\rm p}1, as n_0\rightarrow\infty and the second term converges to {\cal{N}}(0,3/4) in distribution. Condition on L_u with L_u\rightarrow\infty , \dfrac{S_{L_u}-(1/3)L_u}{\sqrt{L_u}} converges to {\cal{N}}(0,1/4) in distribution by (3). Since S_{L_u}|L_u is independent of L_u , by Slutsky’s lemma, we can obtain conclusion (4).
2.3
Common ancestor
According to fact (c) in Section 2.2, the shortest path between any two potential nodes must pass through some of their common ancestors. To find the distance between any two potential nodes, we need to find the relationship between the distance of this pair of nodes and their latest common ancestor. First, we want to divide the shortest path into two parts.
Claim 2. For any pair of nodes u and v with sequences {\boldsymbol{u}} and {\boldsymbol{v}} , respectively,
Proof. For convenience, we use the sequence of the node to represent the node in the following.
If {\boldsymbol{u}} is the ancestor of {\boldsymbol{v}} or vice versa, then this conclusion is easy to reach. If not, based on the previous facts in Section 2.2, we can assume that the shortest path between them is {\boldsymbol{u}}\rightarrow...\rightarrow{\boldsymbol{x}}\rightarrow...\rightarrow{\boldsymbol{v}} where {\boldsymbol{x}} is either {\boldsymbol{u}}\land {\boldsymbol{v}} or one of its ancestors. For path {\boldsymbol{u}}\rightarrow...\rightarrow{\boldsymbol{x}} , each node is the parent of the previous node; for path {\boldsymbol{x}}\rightarrow...\rightarrow{\boldsymbol{v}} , each node is the parent of the next node. If {\boldsymbol{x}} is {\boldsymbol{u}}\land {\boldsymbol{v}} , the conclusion is obvious. If {\boldsymbol{x}} is one of the ancestors of {\boldsymbol{u}}\land {\boldsymbol{v}} , assume that the two nodes of the child of {\boldsymbol{u}}\land {\boldsymbol{v}} closest to {\boldsymbol{x}} are ({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_1 and ({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_2 , where {\boldsymbol{w}}_1 and {\boldsymbol{w}}_2 represent the sequences where all the symbols are the same. Because of symmetry, we can assume that {\boldsymbol{w}}_1 = l...l , {\boldsymbol{w}}_2 = r...r , |{\boldsymbol{w}}_1|\ge 0 , and |{\boldsymbol{w}}_2|\ge 0 .
Since it is impossible for the left offspring of {\boldsymbol{u}}\land {\boldsymbol{v}} and the right offspring of {\boldsymbol{u}}\land {\boldsymbol{v}} to have a shortcut to the same common ancestor at the same time, except for {\boldsymbol{u}}\land {\boldsymbol{v}} , there must be another common ancestor (defined as {\boldsymbol{y}} ) in this path. Without loss of generality, we can write the path as
Obviously, there is a shortcut between ({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_1 and {\boldsymbol{x}} and between ({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_2 and {\boldsymbol{y}} . T_r(({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_1) = T_r({\boldsymbol{u}}\land {\boldsymbol{v}}) = {\boldsymbol{x}} and T_l(({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_2) = T_l({\boldsymbol{u}}\land {\boldsymbol{v}}) = {\boldsymbol{y}} means that there is a shortcut between {\boldsymbol{u}}\land {\boldsymbol{v}} and {\boldsymbol{x}} and there is at most one shortcut between {\boldsymbol{u}}\land {\boldsymbol{v}} and {\boldsymbol{y}} . There is a path through {\boldsymbol{u}}\land {\boldsymbol{v}} ,
where {\boldsymbol{u}}\rightarrow...\rightarrow({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_1 and ({\boldsymbol{u}}\land {\boldsymbol{v}}){\boldsymbol{w}}_2\rightarrow...\rightarrow{\boldsymbol{v}} are the same as those in path (7).
Since path (7) is the shortest path and it is possible that node {\boldsymbol{y}} is node {\boldsymbol{u}}\land {\boldsymbol{v}} , the distance between {\boldsymbol{x}} and {\boldsymbol{y}} is at least 1. Therefore, if we look for the shortest path through {\boldsymbol{u}}\land {\boldsymbol{v}} , the length of this path differs from the length of the path (7) by at most one. The proof of Claim 2 is complete.
Next, we focus on the length of the sequence difference between two nodes and their latest common ancestor. For any pair of nodes u and v picked u.a.r. in GAN _1(n) with sequences {\boldsymbol{u}} and {\boldsymbol{v}} , respectively, assume that the individual in CTBP corresponding to u\land v was born at the \tau_{u\land v} th split. According to Section 2 in Ref. [24], we can write L_u =\displaystyle \sum\limits_{i = 1}^{n_0} I_i, where I_i are conditionally independent and equal 0 or 1 depending on whether the individual in the ancestral line is newborn at time t_i . Since we know that the number of active individuals (potential nodes) generated at each split is 2 and the total number of active individuals at the i th split is i+1 , we can see that the indicators I_i are independent and \mathbb{P}(I_i = 1) = \dfrac{2}{i+1}. In the same way, we define L_{v} = \displaystyle \sum\limits_{i = 1}^{n_0} I'_i. Note that the event (I_i,I'_i) = (1,1) means that the ancestral lines of two nodes merge at the i th split. The joint conditional distribution of I_i and I'_i can be written as
that \tau_{u\land v} has a limiting distribution, and (\log n_0-\log \tau_{u\land v})/\log n_0 converges to 1 in probability. This also means that L_{u\land v} has a limiting distribution, independent of n_0 . Considering the length of the sequence difference between a potential node and its ancestor, we can obtain the following result.
Proposition 1. Suppose u and v are two potential nodes picked u.a.r. in {\rm{GAN}}_1(n) with size n_0+2 , then,
Using Lindeberg central limit theorem and Eqs. (8)–(10) for linear combinations of \displaystyle \sum\limits_{i = \tau_{u\land v}}^{n_0}(a I_i+b I'_i), where a and b are two arbitrary constants, the two variables in (11) converge jointly to a two-dimensional standard normal variable in distribution.
3.
Typical distance and flooding time
As prepared in the previous section, we now consider the typical distance and flooding time in the GAN model. We first state our main results in the following.
Theorem 2. (Typical distance) Let D_n be the typical distance of {{\rm{GAN}}}(n) . Then as n\rightarrow\infty ,
To prove these two theorems, we need to use several lemmas as follows. In fact, we can define the radius of the network GAN as the length of the longest path from one of the initial nodes since the longest path of the network GAN must pass through one of the initial nodes.
Lemma 2[6]. The radius of the subnetwork {{\rm{GAN}}}_i(n) is w.h.p. asymptotic to c\log n with the constant c is defined in Theorem 1.
Lemma 3. Let R_{n_0} be the distance between the node picked u.a.r. in GAN _1(n) and either of the initial nodes. If n_0\rightarrow\infty , then
Proof. For any existing node picked u.a.r. in GAN _1(n) , there is a potential node that makes the distance between these two nodes equal to 1. By Lemma 1, the conclusion (13) is easy to obtain.
Lemma 4. Let D_n^1 be the typical distance of GAN _1(n) . If n_0\rightarrow\infty , then
Proof. Pick a pair of potential nodes u and v u.a.r. from GAN _1(n) whose sequences are {\boldsymbol{u}} and {\boldsymbol{v}} , respectively, and u\land v is their latest common ancestor with the sequence {\boldsymbol{u}}\land{\boldsymbol{v}} . In the following descriptions, we will directly represent nodes as the sequence of nodes for convenience. Define the distinct postfixes \tilde{{\boldsymbol{u}}},\tilde{{\boldsymbol{v}}} after {\boldsymbol{u}}\land {\boldsymbol{v}} by
Since S_{L_{u}-L_{u\land v}}|L_{u}-L_{u\land v} is independent of L_{u}-L_{u\land v} and the lengths of the sequences \tilde{{\boldsymbol{u}}} and \tilde{{\boldsymbol{v}}} are independent, we can obtain the following conclusion in the same way as the proof of Lemma 1.
By conditioning first on L_{u\land v} and using the fact that the symbols in the sequence \tilde{{\boldsymbol{u}}} and \tilde{{\boldsymbol{v}}} are i.i.d., it can be shown that (S_{L_{u}-L_{u\land v}}-{(2/3)}\log n_0, \; S_{L_{v}-L_{u\land v}}-{(2/3)}\log n_0)/\sqrt{\log n_0} converges jointly to two independent copies of {\cal{N}}(0,8/27) in distribution.
By Eq. (14), the distance between two potential nodes picked u.a.r. in GAN _1(n) has the following property.
As shown in Fig. 2, each node in {\rm GAN}_1(n) is connected to two potential nodes (except the initial nodes). For randomly selected nodes u' and v' in GAN _1(n) with sequences {\boldsymbol{u}}' and {\boldsymbol{v}}' , respectively, suppose that the nearest potential nodes are {\boldsymbol{u}} and {\boldsymbol{v}} , respectively. Obviously, node {\boldsymbol{u}}' ({\boldsymbol{v}}' ) is one of the ancestors of potential node {\boldsymbol{u}}({\boldsymbol{v}}). Then,
where w_1 and w_2 represent the difference between the two sequences. Define the distinct postfixes \tilde{{\boldsymbol{u}}}',\tilde{{\boldsymbol{v}}}' after {\boldsymbol{u}}'\land {\boldsymbol{v}}' by
Therefore, the distance between any pair of nodes selected u.a.r. from GAN _1(n) satisfies the asymptotic normality (15). The proof of this lemma is now complete.
Proof of Theorem 1. For scenario (A), as n\rightarrow\infty , since D_n^1 is the typical distance of the subnetwork GAN _1(n) , we can use fact (a) and Lemma 4 to obtain
This means that the distance between a randomly chosen pair of nodes in a randomly chosen initial interval has the above property due to the symmetry of GAN( n ).
For scenario (B), the path between any two nodes that originate from two different initial intervals must pass through one of the initial nodes. Consider these two nodes to have originated from {\boldsymbol{I}}_1 and {\boldsymbol{I}}_2 , respectively. Because of the independence of each node in GAN _1 and GAN _2 , we can see that the distance between the two nodes has the same distribution as 2R_{n_0} , which is mentioned in Lemma 3. By (13) and (1), as n\rightarrow\infty , 2R_{n_0} satisfies
Therefore, the distances between any two randomly selected nodes from two of any randomly selected initial internals satisfy the above property.
After combining the conclusions of these two scenarios, it is not difficult to conclude that the distance between pairs of nodes in GAN( n ) satisfies (12).
Proof of Theorem 2. The flooding time in GAN (n) can be expressed as
\max\limits_{v}{\rm{dist}}( u,v),
where u is a node picked u.a.r. in GAN (n) . With high probability, the node most distant from u is in another initial interval. Therefore,
Without loss of generality, we consider the properties of the distance in {\rm GAN}_1. By Lemma 3, {\mathbb{E}}[R_{n_0}]\sim \dfrac{2}{3}\log n_0 and {\rm{Var}}(R_{n_0})\sim \dfrac{8}{27}\log n_0. If n_0\rightarrow\infty , by Chebyshev’s inequality,
This work was supported by the National Natural Science Foundation of China (11771418).
Conflict of interest
The authors declare that they have no conflicts of interest.
Conflict of Interest
The authors declare that they have no conflict of interest.
The asymptotic properties of the typical distance and the flooding time in a geographic attachment network (GAN) model are studied.
The typical distance of GAN is asymptotically normal.
The flooding time of GAN converges to a given constant in probability.
Wasserman S, Faust K. Social Network Analysis: Methods and Applications. Cambridge, UK: Cambridge University Press, 1994.
[2]
Wu J, Tse C K, Lau F C M, et al. Analysis of communication network performance from a complex network perspective. IEEE Transactions on Circuits and Systems,2013, 60: 3303–3316. DOI: 10.1109/TCSI.2013.2264697
[3]
Newman M E J. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences of the United States of America,2001, 98: 404–409. DOI: 10.1073/pnas.98.2.404
[4]
Alm E, Arkin A P. Biological networks. Current Opinion in Structural Biology,2003, 13: 193–202. DOI: 10.1016/S0959-440X(03)00031-9
[5]
Ozik J, Hunt B R, Ott E. Growing networks with geographical attachment preference: Emergence of small worlds. Physical Review E,2004, 69: 026108. DOI: 10.1103/PhysRevE.69.026108
[6]
Feng Q, Wang Y, Hu Z. Small-world effect in geographical attachment networks. Probability in the Engineering and Informational Sciences,2021, 35: 276–296. DOI: 10.1017/S0269964819000342
[7]
Hayashi Y. A review of recent studies of geographical scale-free networks. Information and Media Technologies,2006, 1: 1136–1145. DOI: 10.11185/imt.1.1136
[8]
Zhang Z, Rong L, Comellas F. Evolving small-world networks with geographical attachment preference. Journal of Physics A: Mathematical and General,2006, 39: 3253. DOI: 10.1088/0305-4470/39/13/005
[9]
Zhang Z, Rong L, Guo C. A deterministic small-world network created by edge iterations. Physica A: Statistical Mechanics and its Applications,2006, 363: 567–572. DOI: 10.1016/j.physa.2005.08.020
[10]
Barabási A L, Albert R. Emergence of scaling in random networks. Science,1999, 286: 509–512. DOI: 10.1126/science.286.5439.509
[11]
Kolossváry I, Komjáthy J, Vágó L. Degrees and distances in random and evolving Apollonian networks. Advances in Applied Probability,2016, 48: 865–902. DOI: 10.1017/apr.2016.32
[12]
Zhou T, Yan G, Wang B. Maximal planar networks with large clustering coefficient and power-law degree distribution. Physical Review E,2005, 71: 046141. DOI: 10.1103/PhysRevE.71.046141
[13]
Andrade Jr J S, Herrmann H J, Andrade R F, et al. Apollonian networks: simultaneously scale-free, small world, Euclidean, space filling, and with matching graphs. Physical Review Letters,2005, 94: 018702. DOI: 10.1103/PhysRevLett.94.018702
[14]
Zhang Z, Rong L, Zhou S. Evolving Apollonian networks with small-world scale-free topologies. Physical Review E,2006, 74: 046105. DOI: 10.1103/PhysRevE.74.046105
[15]
Abdullah M A, Bode M, Fountoulakis N. Typical distances in a geometric model for complex networks. arXiv: 1506.07811, 2015
[16]
Dereich S, Mönch C, Mörters P. Typical distances in ultrasmall random networks. Advances in Applied Probability,2012, 44: 583–601. DOI: 10.1239/aap/1339878725
[17]
Bhamidi S, van der Hofstad R, Hooghiemstra G. First passage percolation on the Erdös–Rényi random graph. Combinatorics, Probability and Computing,2011, 20: 683–707. DOI: 10.1017/S096354831100023X
[18]
Bhamidi S, van der Hofstad R. Weak disorder asymptotics in the stochastic mean-field model of distance. Advances in Applied Probability,2012, 22: 29–69. DOI: 10.1214/10-AAP753
[19]
van der Hofstad R, Hooghiemstra G, van Mieghem P. The flooding time in random graphs. Extremes,2002, 5: 111–129. DOI: 10.1023/A:1022175620150
[20]
Camargo D, Popov S. Total flooding time and rumor propagation on graphs. Journal of Statistical Physics,2017, 166: 1558–1571. DOI: 10.1007/s10955-017-1731-0
[21]
Amini H, Draief M, Lelarge M. Flooding in weighted sparse random graphs. SIAM Journal on Discrete Mathematics,2013, 27: 1–26. DOI: 10.1137/120865021
[22]
Mountford T, Saliba J. Flooding and diameter in general weighted random graphs. Journal of Applied Probability,2020, 57: 956–980. DOI: 10.1017/jpr.2020.45
[23]
Athreya K B, Ney P E. Branching Processes. Berlin: Springer, 1972.
[24]
Bühler W J. Generations and degree of relationship in supercritical Markov branching processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete,1971, 18: 141–152. DOI: 10.1007/BF00569184
[25]
Feller W. An Introduction to Probability Theory and Its Applications, Vol. 1. 3rd Edition. New York: Wiley, 2008.
Figure
1.
Illustration of the growing GAN model with potential nodes for time n = 0 , 1, and 2, where points \bullet represent nodes in the network, points \circ are potential nodes, and red dashed lines are potential edges.
Figure
2.
Initial internode internals in GAN(0).
Figure
3.
(a) is the subnetwork {{\rm{GAN}}}_1 after adding several nodes without marking potential nodes where the nodes labeled 0 and 1 are the initial nodes. By redrawing the network, we can obtain (b). The nodes marked as black circles and the nodes marked as blue squares are actual nodes and potential nodes, respectively. Except for the blue dotted line, the solid lines and the dotted lines represent the ancestral line of each node and shortcuts, respectively. Furthermore, the black lines and red lines represent existing edges and potential edges, respectively. u_0 is one of the potential nodes of this subnetwork.
References
[1]
Wasserman S, Faust K. Social Network Analysis: Methods and Applications. Cambridge, UK: Cambridge University Press, 1994.
[2]
Wu J, Tse C K, Lau F C M, et al. Analysis of communication network performance from a complex network perspective. IEEE Transactions on Circuits and Systems,2013, 60: 3303–3316. DOI: 10.1109/TCSI.2013.2264697
[3]
Newman M E J. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences of the United States of America,2001, 98: 404–409. DOI: 10.1073/pnas.98.2.404
[4]
Alm E, Arkin A P. Biological networks. Current Opinion in Structural Biology,2003, 13: 193–202. DOI: 10.1016/S0959-440X(03)00031-9
[5]
Ozik J, Hunt B R, Ott E. Growing networks with geographical attachment preference: Emergence of small worlds. Physical Review E,2004, 69: 026108. DOI: 10.1103/PhysRevE.69.026108
[6]
Feng Q, Wang Y, Hu Z. Small-world effect in geographical attachment networks. Probability in the Engineering and Informational Sciences,2021, 35: 276–296. DOI: 10.1017/S0269964819000342
[7]
Hayashi Y. A review of recent studies of geographical scale-free networks. Information and Media Technologies,2006, 1: 1136–1145. DOI: 10.11185/imt.1.1136
[8]
Zhang Z, Rong L, Comellas F. Evolving small-world networks with geographical attachment preference. Journal of Physics A: Mathematical and General,2006, 39: 3253. DOI: 10.1088/0305-4470/39/13/005
[9]
Zhang Z, Rong L, Guo C. A deterministic small-world network created by edge iterations. Physica A: Statistical Mechanics and its Applications,2006, 363: 567–572. DOI: 10.1016/j.physa.2005.08.020
[10]
Barabási A L, Albert R. Emergence of scaling in random networks. Science,1999, 286: 509–512. DOI: 10.1126/science.286.5439.509
[11]
Kolossváry I, Komjáthy J, Vágó L. Degrees and distances in random and evolving Apollonian networks. Advances in Applied Probability,2016, 48: 865–902. DOI: 10.1017/apr.2016.32
[12]
Zhou T, Yan G, Wang B. Maximal planar networks with large clustering coefficient and power-law degree distribution. Physical Review E,2005, 71: 046141. DOI: 10.1103/PhysRevE.71.046141
[13]
Andrade Jr J S, Herrmann H J, Andrade R F, et al. Apollonian networks: simultaneously scale-free, small world, Euclidean, space filling, and with matching graphs. Physical Review Letters,2005, 94: 018702. DOI: 10.1103/PhysRevLett.94.018702
[14]
Zhang Z, Rong L, Zhou S. Evolving Apollonian networks with small-world scale-free topologies. Physical Review E,2006, 74: 046105. DOI: 10.1103/PhysRevE.74.046105
[15]
Abdullah M A, Bode M, Fountoulakis N. Typical distances in a geometric model for complex networks. arXiv: 1506.07811, 2015
[16]
Dereich S, Mönch C, Mörters P. Typical distances in ultrasmall random networks. Advances in Applied Probability,2012, 44: 583–601. DOI: 10.1239/aap/1339878725
[17]
Bhamidi S, van der Hofstad R, Hooghiemstra G. First passage percolation on the Erdös–Rényi random graph. Combinatorics, Probability and Computing,2011, 20: 683–707. DOI: 10.1017/S096354831100023X
[18]
Bhamidi S, van der Hofstad R. Weak disorder asymptotics in the stochastic mean-field model of distance. Advances in Applied Probability,2012, 22: 29–69. DOI: 10.1214/10-AAP753
[19]
van der Hofstad R, Hooghiemstra G, van Mieghem P. The flooding time in random graphs. Extremes,2002, 5: 111–129. DOI: 10.1023/A:1022175620150
[20]
Camargo D, Popov S. Total flooding time and rumor propagation on graphs. Journal of Statistical Physics,2017, 166: 1558–1571. DOI: 10.1007/s10955-017-1731-0
[21]
Amini H, Draief M, Lelarge M. Flooding in weighted sparse random graphs. SIAM Journal on Discrete Mathematics,2013, 27: 1–26. DOI: 10.1137/120865021
[22]
Mountford T, Saliba J. Flooding and diameter in general weighted random graphs. Journal of Applied Probability,2020, 57: 956–980. DOI: 10.1017/jpr.2020.45
[23]
Athreya K B, Ney P E. Branching Processes. Berlin: Springer, 1972.
[24]
Bühler W J. Generations and degree of relationship in supercritical Markov branching processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete,1971, 18: 141–152. DOI: 10.1007/BF00569184
[25]
Feller W. An Introduction to Probability Theory and Its Applications, Vol. 1. 3rd Edition. New York: Wiley, 2008.