Liyuan Lin received his M.S. degree from the University of Science and Technology of China in 2024. His research mainly focuses on two-sided markets and opaque selling
Motivated by the business model called “community group buying” (CGB), which has emerged in China and some countries in Southeast Asia, such as Singapore and Indonesia, we develop algorithms that could help CGB platforms match consumers with stage-stations (the picking up center under the CGB mode). By altering the fundamental design of the existing hierarchy algorithms, improvements are achieved. It is proven that our method has a faster running speed and greater space efficiency. Our algorithms avoid traversal and compress the time complexities of matching a consumer with a stage-station and updating the storage information to O(logM) and O(MlogG), where M is the number of stage-stations and G is that of the platform’s stock-keeping units. Simulation comparisons of our algorithms with the current methods of CGB platforms show that our approaches can effectively reduce delivery costs. An interesting observation of the simulations is worthy of note: Increasing G may incur higher costs since it makes inventories more dispersed and delivery problems more complicated.
Graphical Abstract
The fundamental processes of a community group buying platform.
Abstract
Motivated by the business model called “community group buying” (CGB), which has emerged in China and some countries in Southeast Asia, such as Singapore and Indonesia, we develop algorithms that could help CGB platforms match consumers with stage-stations (the picking up center under the CGB mode). By altering the fundamental design of the existing hierarchy algorithms, improvements are achieved. It is proven that our method has a faster running speed and greater space efficiency. Our algorithms avoid traversal and compress the time complexities of matching a consumer with a stage-station and updating the storage information to O(logM) and O(MlogG), where M is the number of stage-stations and G is that of the platform’s stock-keeping units. Simulation comparisons of our algorithms with the current methods of CGB platforms show that our approaches can effectively reduce delivery costs. An interesting observation of the simulations is worthy of note: Increasing G may incur higher costs since it makes inventories more dispersed and delivery problems more complicated.
Public Summary
We develop algorithms to help community group buying platforms compress the time complexities of matching a consumer with a stage-stations and updating the storage information.
With simulations, we make comparisons between our algorithms and the currently usual methods of community group buying platforms and find that our approaches can effectively reduce delivery costs.
With more stock keeping uinits, a community group buying platform may bear higher delivery costs.
Since 2020, the outbreak of COVID-19 has greatly changed Chinese people’s purchasing habits. Among the several novel e-commerce modes that have been propelled to emerge and prevail by such changes, community group buying (CGB) is the one that has grown rapidly. IiMedia’s reports forecasted that the market size of CGB in China might achieve USD 14 billion in 20221. This number proved to be USD 29 billion in 2022 and 44 billion in 20232, much more than the previous predictions. Seeing such positive development results, IiMedia reannounced that CGB platforms are playing increasingly important roles between consumers and business entities and in optimizing resource distribution in 20243. The number of CGB users has reached 678 million in China in 20232. Meituan Select, a leading CGB platform, has covered more than 2000 counties in China and the cumulative number of trading users has meet 470 million in 20234. The success of CGB in China has brought much transformation, especially for remote Chinese towns5. It has also attracted attention from other countries in Southeast Asia. Several platforms have emerged in Indonesia (Super, Dagangan), Vietnam (Aemi, Selly), the Philippines (Sarisuki), and Singapore (We-buy), and the CGB market in Southeast Asia is about USD 1.4 billion in 2023 according to Bain6. Mounting evidence indicates that the CGB industry is attracting considerable interest among operations management and marketing professionals in academia and industry[1].
Under the CGB mode, the platform will employ existing, small, family-owned shops to act as its stage-stations. The shop owners are called the agents in this paper. Consumers place orders on the platform and choose stage-stations. Before the scheduled time (usually the next evening), the goods will be delivered to the chosen stage-stations, and consumers will go there and pick up the goods for themselves, which naturally addresses the last-mile delivery problem. By sharing 10% to 30% of the revenue of each order, the platform can utilize the spare spaces in the agents’ shops, which solves the problems of high logistics costs and spoilage in groceries7. To reduce delivery costs, the platform builds distributed stock-keeping units (SKUs, or grid-warehouses) to store goods and deploy the divide-and-conquer strategy for delivery. Meituan Select, for example, the largest CGB platform in China, built more than 1500 grid-warehouses in 2021 and expanded the number to 3000 in the first half of 2022. Duoduo Maicai, a smaller Chinese CGB unit, has fewer grid-warehouses, but this number also surpassed 1000 in 20218.
Despite the expanding market size and effective operation mode, one of the most vital focuses of CGB platforms is to further decrease costs9. Among all the kinds of costs, the delivery costs between the grid-warehouses and the stage-stations accounts for a vital part. To reduce this part of costs, the platform needs to match the demands with the stage-stations more accurately so that the delivery routes from grid-warehouses to stage-stations are more appropriate and incur less costs. The allocation of demands largely depends on the platform’s recommendation of stage-stations to consumers. At present, upon a consumer placing an order, CGB platforms always recommend the closest stage-station to him/her. Although the final choice is made by the consumer, he/she has to click and scan to learn more information about other stage-stations. Recommending the closest station is effort-saving for both platforms and consumers. However, can such a method necessarily minimize the delivery costs? If not, are there any other matching methods that could do better? These questions motivate this paper, which aims to find an algorithm that can help CGB platforms match consumers and stage-stations and reduce delivery costs.
The problem can be classified as a spatial matching problem, which considers the matching of different groups with spatial positions. Examples include passengers and available drivers in ride-hailing, delivery riders and consumers in takeaway service, bikes and users in bike-sharing. Previous studies have developed theories and proposed methods for such problems. However, these studies mainly focus on consumers and supplies and explore what influences may be brought about when the arrival process[2, 3] or the spatial dimension changes[4–6]. Under the CGB background, a platform aims to match its consumers with stage-stations to minimize delivery costs, which inevitably involves the third side: grid-warehouses. The inventory information in grid-warehouses and the relative positions with stage-stations make the matching problem more complicated and cannot be well addressed by the existing algorithms and data structures, which leaves a research gap and motivates our research.
To fill this gap, based on the hierarchy algorithm proposed by Ref. [6], we develop our algorithms and data structures to store the position and inventory information for the recommendation. We prove that with our algorithms, the time complexities of dynamic recommendation and updating the storage information are O(logM) and O(MlogG), respectively, where M is the number of stage-stations and G is that of grid-warehouses. We also present visual conclusions about how our algorithms surpass previous algorithms in running speed and space efficiency. We also conduct simulations to compare the delivery costs of our method with those of the currently used method. The results show that the current method of matching a consumer with the closest stage-station may not always minimize delivery costs. Because this method might allocate the demands on too many stage-stations, which expands the sizes of delivery problems. Our method, with the new algorithms storing, updating, and searching for all kinds of information, improves the allocations of demands. As a result, demands are more concentrated on less stage-stations, which downsizes the delivery problems and therefore decreases delivery costs. Our work contributes to previous research on spatial matching by improving the existing algorithms. We obtain results that provide managerial suggestions for CGB platforms on stage-station recommendation and decisions on the numbers of stations and warehouses. We shed light on future research concerning the dynamic matching of demand and supply and the last-mile delivery problem.
The remainder of this paper is organized as follows. First, we review prior literature and provide our motivation in Section 2. Then, we present our hierarchy algorithms and explain their feasibility in Sections 3 and 4. We show our simulation results and graphical demonstrations in Section 5. Finally, in Section 6, we provide concluding remarks on our work.
2.
Literature review
Our work is essentially related to three streams of literature. First, we draw from and build upon previous studied on spatial matching. Ajtai et al.[4] studied the spatial matching in two dimensions. They considered 2N independently distributed points and determined how to optimally make N pairs of matching. Talagrand[5] focused on the same problem and extended the results to three and more dimensions. There is also a rich line of work on static matching between a Poisson process and the Lebesgue measure in Rd. Hoffman et al.[7] explored the stable matching between two measures and proved the lower bounds on one typical allocation distance. Chatterjee et al.[2] conducted their research under the same basic settings but considered gravitational allocation. They show exponential decay of the “allocation diameter”. Markó and Timár[8] further discussed the same problem and obtained a Poisson allocation with optimal tail behavior. They got inspiration from Ref. [4] and proposed a distinct approach. Holden et al.[3] focused on gravitational matching for uniform points and recovered the previous results for 2-dimensional static spatial matching.
The above studies provide comprehensive results for the static matching problem. Another important research topic of spatial matching is semi-dynamic matching. Shor[9, 10] considered the average-case dynamic bin packing problem. He proved that the dynamic bin packing problem is related to static spatial matching in two dimensions. Ref. [10] spiritually resembles the approach of the hierarchy algorithm. In recent years, with the increasing prevalence of the sharing economy and the growth of online matching platforms, a growing number of studies have tried to propose better methods for the dynamic matching of demand and supply. Akbarpour et al.[11] considered a semi-dynamic model in one dimension. They obtained an important observation that excessive supply can effectively reduce the matching distance. They also proved an upper bound for that average match distance under the greedy algorithm. Ref. [4] was the first to construct the theory of hierarchy algorithm. It proved the upper and lower bounds of the matching costs for all dimensions and gave the most general results.
Together, the above studies inspire us to develop new algorithms. In our work, the matching of consumers and stage-stations is also a dynamic process. Under the background of CGB, the matching for one consumer involves multiple stations and grid-warehouses, which is more complicated. Most of the existing studies do not consider the third side other than consumers and supplies, which makes their algorithms unsuitable for our problems.
The second stream of related literature concerns CGB. As a prevailing business model whose market has experienced explosive growth in the post-COVID-19 pandemic era, CGB has attracted the attention of academia. Sun and Zhang[12] used a stylized game-theoretical model to investigate the optimal operation strategies for nanostores in CGB. They discussed the conditions under which the competitive (cooperative) strategy will be preferred. Li et al.[13] considered the social attributes of CGB and introduced the social reinforcement effect to construct a G-SCIR model of PSQ propagation. Yu et al.[1] addressed a real-world delivery problem of perishable products faced by a CGB operator. They proposed a branch-and-price algorithm that is verified to be effective by randomly generated instances.
In a word, the existing studies on CGB come from multiple perspectives. However, to the best of our knowledge, few studies have considered the station selection and recommendation problem and improved the platform’s decisions from this perspective. Therefore, our work tries to fill this gap.
This paper also has points of contact with research on the last-mile delivery problem, which has attracted growing attention in the past decade. Sharing mobility is regarded as a valid solution for the last-mile delivery problem and has been discussed by a great number of studies. Li et al.[14] proposed a framework for utilizing taxis to transport people and deliver parcels. Qi et al.[15] studied approaches to appropriately divide a target region into several subregions and used shared mobility to perform the last-mile delivery within each subregion. Agussurja et al.[16] developed a two-level planning model that incorporates the use of ride-sharing to satisfy last-mile demands while making real-time clustering and routing decisions. Cao et al.[17] built a model for optimizing the last-mile delivery problem that combines the use of ride-sharing platforms with traditional in-house van delivery systems.
In addition to ride-sharing, other approaches to the last-mile delivery problem have also drawn attention from academia. Mousavi et al.[18] considered a strategy of crowdsourcing and explored methods to optimally select mobile depot locations in advance of full information about the availability of crowd-shippers. They modeled the problem as a two-stage stochastic integer programming that incorporates the uncertainty in the crowd-shipper availability. Lyu and Teo[19] studied the case of a “Locker Alliance” network, a concept proposed by the Singapore government to build an interoperable network of public lockers in residential areas and hot spots in the community to improve the efficiency of last-mile parcel delivery operations. Reed et al.[20] explored in what geographies delivery assisted by autonomous vehicles is the most valuable for last-mile delivery. They integrated real-world data in a case study to build insights across urban-to-rural settings. The CGB mode creates an innovative approach to solve the last-mile delivery problem, where a platform makes use of existing shops as potential stations, and buyers help themselves pick up goods, which naturally solves the problem. Our work aims to further improve such an approach in the process of selecting and recommending stations, providing a new prospective for future research.
3.
Problem description and notations
We consider a CGB platform possessing G grid-warehouses and having employed M existing shops to be its stage-stations. We assume that all grid-warehouses, stage-stations, and consumers are located in region R1=[0,1]2. The positions of the jth station and the kth warehouse are denoted as (xSj,ySj) and (xWk,yWk), respectively. The initial inventory of goods in grid-warehouse k is IVk. There will be N consumers placing orders on the platform sequentially in one day. We assume that each consumer has demand of one unit. The reason is that a consumer with demand of D>1 can be regarded as D consumers with the same locations, each of whom has a demand of 1. Let G∑k=1IVk⩾N, i.e., there will always be at least one grid-warehouse with inventory to satisfy the need of the emerging consumer. Such an assumption fits reality well. In fact, the platform can utilize previous data on users’ purchases, make inferences about the amounts of future demands and prepare enough goods in the grid-warehouses to avoid the extra fees caused by storage shortages. Moreover, the inventory shortage will not affect the essential problems of our study. Upon a consumer placing an order, the platform needs to immediately search for a stage-station to make the recommendation. The consumer can accept the platform’s recommendation or choose other stage-stations. Either way, the consumer’s order will be distributed to a stage-station. Before the scheduled time, the good will be sent to this stage-station from one of the warehouses, and the consumers can go for picking up. Table 1 summarizes the notations.
Currently, it is universal that the platform directly recommends the station closest to the consumer. Such an operation indeed facilitates consumers. However, it may leave the platform to face higher transportation costs.
As shown in Fig. 1, S1 and S2 are the stations closest to consumers C1 and C2, respectively. If the consumers both accept the recommendations of the closest stations, the platform’s vehicle would have to travel to both S1 and S2 (Fig. 1a). However, if the platform recommends S1 to C2 from the first place and C2 accepts such a recommendation, the travel distance can be reduced effectively (Fig. 1b). With this example, we know that there is a space for improvement of the platform’s station recommendation. The problem is how to perform better. Specifically, when a consumer arrives, how can the platform choose one stage-station that might cause lower delivery costs than recommending the closest stage-station?
To answer these questions, we develop our hierarchy algorithms and data structures to provide a feasible approach. We split the problem into two parts: ① Finding candidate stage-stations for an arrived consumer. ② Considering the inventory information of grid-warehouses, one of the candidate stage-stations for recommendation is selected. These two parts are addressed by our Algorithms 1 and 2, which will be presented in the following section in sequence.
4.
Hierarchy algorithms
4.1
The core idea
The theory of hierarchy algorithm was first proposed by Kanoria[6]. He dedicates to solving the problem that has emerged from the prevalence of sharing and platform economies, which is the matching of demands and supplies. Kanoria considers units of demands and supplies in [0,1]d(d⩾1), and each demand unit needs to be matched with a supply. He deploys hierarchy algorithm, which first recursively “cuts” the edges into two parts and creates multiple “cubes” to a certain level. The position information of all the units is also recursively stored in these cubes. By doing so, the divide-and-conquer strategy can be deployed when a consumer arrives. The detailed process and how “divide-and-conquer” acts are presented in Supporting Information. The core idea of Kanoria can be demonstrated in Fig. 2.
Based on Ref. [6], we make innovations and establish the methods that fit our problem better, which are what we will state below.
4.2
The hierarchy of regions
In this subsection, we first present our improvement of Kanoria’s work and the main difference. Then, we give a theorem about why our approach is better. Finally, we establish Algorithm 1, where our approach is utilized to find candidate stage-stations for an arrived consumer with a random position and give theorems about its feasibility, time complexity, and space complexity.
Similar to Kanoria, we make partitions of the regions to a certain level. However, unlike his work, where the recursion process is performed by cutting the edges into two equal parts, we conduct the process by dividing the edges into three equal parts. Fig. 3 shows our idea.
For a region Rf, we divide the edge of Rf in each dimension into three equal parts, which creates 32=9 “son-regions” of Rf. After indexing all the son-regions, we perform the dividing and creating process recursively. There will be 9n regions in level n. We stop the process at level L if L satisfies 9L⩾M. In other words, we hope that L satisfies that there is approximately one stage-station in each region of level L, which makes L=⌈log9M⌉. We can also call the regions cubes. The level of a region Rf is denoted as Lev(Rf). Fig. 4 demonstrates the case of M=600 and the cubes in the deepest level, where the colorful scatters are the stage-stations.
Figure
4.
The cubes in the deepest level for M=600.
For simplicity and capturing the characteristics, we call the core idea of dividing the edges into three equal parts “ternary-dividing” and that of Kanoria “binary-dividing”. We have the following results.
Theorem 1. (i) The running speed of inserting and searching under ternary-dividing is faster than that under binary-dividing. (ii) The space efficiency of ternary-dividing is also higher than that of binary-dividing.
The main reason for ternary-dividing being better on running speed and storage space is that it balances the number of levels and size of each level. More detailed explanations and proofs are provided in Supporting Information.
Next, with pseudocode, we will present how we deploy ternary-dividing to select candidate stations after a consumer places an order.
Theorem 2. (The correctness, time, and space complexities of the core parts of Algorithm 1) (i) The space required to store the stage-stations in each region is O(MlogM), and the time complexity of inserting all the stage-stations into all the regions is O(M(logM)2). (ii) The time complexity of finding RaandRb is O(loglogM). (iii) For each consumer, the time complexity of finding the candidate stage-stations for him/her is O(logM).
Algorithm 1. Ternary-dividing (hierarchy on regions) for candidate stations.
Input: The number of all the stage-stations, M; the locations of all the stage-stations; the number of all the grid-warehouses, G; the locations of all the grid-warehouses; the number of candidate stage-stations the platform wishes for each consumer, Z.
Output:
1: Complete the dividing of regions, record the total amount of regions as l0, i.e., l0=∑k=Lk=09k.
2: Insert all the stage-stations into the regions, record the stage-stations in each region. The number of stage-stations in a region Rf is stored as NS(Rf).
3: for each consumer i≤Ndo
4: Ri=R0,P=[R0]
5: whileLev(Ri)<Ldo
6: According to the consumer’s location (xCi,yCi), find one of the son-regions of Ri in which (xCi,yCi) falls. The target is denoted as RS.
7: Ri=RS,P=[P,RS].
8: end while
9: Obtain the numbers of stage-stations in each region of P and find two regions Ra and Rb that satisfy NS(Ra)=max and NS(R_{b}) = \min\left\{NS(R) \ge Z | R \in P\right\} .
10: If NS(R_{a}) < M , we make all the stage-stations in R_{a} be candidate stage-stations, otherwise we make the stage-stations in R_{b} be candidates.
We provide an example to demonstrate the process of Algorithm 1 in Supporting Information for better comprehension. It is worth noting that we introduce a new number Z. It is decided by the platform according to the distribution of stage-stations. The purpose of obtaining R_{a} and R_{b} is to prevent the case of taking all M stage-stations as candidates, which will increased time in later selection. The following result shows a disadvantage of ternary-dividing if we insistently want more than Z stage-stations to be candidates and R_{a},R_{b} are not set.
Theorem 3. Ternary-dividing is more likely to confront the case of selecting all M stage-stations if we insistently want more than Z stage-stations to be candidates and \dfrac{Z - 1}{M} \leqslant 0.1732 .
With R_{a} and R_{b} , the disadvantage of ternary-dividing shown in Theorem 3 can be eliminated. The setting of R_{a} and R_{b} obtains a balance between choosing more candidates and saving time.
So far, we have presented and proved the algorithm of selecting candidate stage-stations among all M stations when a consumer arrives. One may doubt whether the chosen stage-stations are close enough to the consumer. In fact, with Algorithm 1, stage-stations far away from the consumers are unlikely to be chosen at all. The reason behind this is that the searching paths of a consumer and a far away stage-stations are too different to be covered by R_{a} or R_{b} .
Note that we only need to recommend one station to a consumer. Therefore, how to further refine the candidate stations and choose one of them to complete the recommendation is of great importance, which is what we will analyze below.
4.3
The hierarchy of grid-warehouses
To refine the candidate stations and choose a suitable station for recommendation, we deploy the greedy strategy. Specifically, for every candidate station, we find the closest in-stock grid-warehouse and record its distance to this candidate station. Then, we compare all the recorded distances and find the smallest one. The corresponding station will be recommended to the consumer. However, here is the problem: How could we find the closest in-stock warehouse for a candidate station? The simplest way of iterating over all warehouses is too time-consuming to be employed. Therefore, we need a better algorithm to accomplish the finding. We propose the hierarchy of grid-warehouses to solve the above problem.
We first build the data structure. The following building process is conducted for every stage-station. We take one station S_{j} as an example.
There are G grid-warehouses in [0, 1]^{2} , which are denoted as W_{1}, W_{2},\cdots, W_{G} . Their distances between grid-warehouses S_{j} are denoted as {\rm dis}_{j1}, {\rm dis}_{j2},\cdots,{\rm dis}_{jG} . Without loss of generality, we can assume that {\rm dis}_{j1} \leqslant {\rm dis}_{j2}\leqslant \cdots \leqslant {\rm dis}_{jG} . Such assumption is reasonable, since we can always conduct the sorting algorithm after the calculation of distances. The first array of S_{j} is denoted as Arr_{j1} = \left\{1, 2, 3, \cdots, G\right\} . We store
where {IV}_{l} is the inventory of grid-warehouse W_{l} . We next recursively split each array into two parts and create at most two new arrays until the array is with length 1. Fig. 5 demonstrates this process for G = 6 . We call the above data structure as the “hierarchy tree of warehouses” for S_{j} . Next, we present the hierarchy and greedy algorithm for the recommended station (Algorithm 2) with pseudocode and the time complexity of this algorithm in Theorem 4.
Theorem 4. (The time complexity of Algorithm 2) (i) The time complexities of searching for a warehouse for a consumer’s order and updating the inventory information of a warehouse with Algorithm 2 are O(M{\rm log}G) and O(M{\rm log}G). (ii) The time complexities of searching for a warehouse for a consumer’s order and updating the inventory information of a warehouse by looking through all the warehouses are O(MG) and O(1) .
Algorithm 2 Hierarchy and greedy algorithm for the recommended station.
Input: The set of candidate stations, E.
Output: One of the candidate stations as the recommended one, denoting it as S^{*} .
1: D = \emptyset . D is a sequence to store the distances between candidates stations and their corresponding closest in-stock warehouses.
2: for every S_{j} \in E do
3: Arr_{j} = Arr_{j1}
4: while |Arr_{j}| > 1 do Arr_{L} = the left-son-sequence of Arr_{j} , Arr_{R} = the right-son-sequence of Arr_{j} .
5: if \sum\nolimits_{l\in Arr_L}IV_l \ge 1 then Arr_j = Arr_L .
6: else Arr_j = Arr_R
7: end if
8: end while
9: Take out the warehouse in Arr_j , compute its distance to S_j and denote the distance as d_{Sj}, D = [D, d_{Sj}] .
10: end for
11: S^*={\rm argmin}_{S_j\in E}\left\{d_{S_j}\in D\right\} . Assign the demand to the corresponding warehouse of S^* , which is denoted as W^* .
We provide an example to demonstrate the process of Algorithm 2 in Supporting Information for better comprehension. Without the hierarchy tree of warehouses, for each candidate station, we iterate for all the warehouses to find the closest in-stock warehouse. The time complexity is obviously O\left(MG\right) . However, since there are no hierarchy trees of warehouses, we only need to update the inventory information for once. The time complexity will be O\left(1\right) .
Theorem 4 tells us that Algorithm 2 actually makes a more reasonable allocation and reaches a better balance between the time of searching and updating. The above algorithm is conducted for every consumer’s candidate stations. Upon the conduction, we allocate one consumer’s demand to a stage-station, and the demand is planned to be satisfied from a certain warehouse.
So far, with Algorithms 1 and 2, we have approaches to select candidate stations and choose one station for recommendation. In the following section, we will conduct simulations to make comparisons and show the reasonableness of our algorithms.
5.
Simulation study
5.1
Comparisons of running speed between binary- and ternary-dividing
We have stated and proved that ternary-dividing possesses faster running speed in inserting and searching than binary-dividing. In this subsection, we conduct simulations to demonstrate this conclusion. For each M, we randomly generate 10^{7} points and find the searching paths for them. 10^{7} is a sufficiently large number that can eliminate the contingency. The results are summarized in Table 2.
As we can see, ternary-dividing can save approximately 20% to 30% of the time in inserting and searching than binary-dividing. It is worth noting that although our algorithms are proposed under the background of CGB, the processes of searching and inserting can be applied to many other circumstances such as food delivery, the matching of sharing bicycles, and ride-hailing services.
5.2
Comparison of delivery costs
After conducting Algorithms 1 and 2 for all consumers, the platform obtains delivery plans. With the current operation of recommending the closest station to a consumer, there will also be such plans. Without these plans, the platform needs to solve a vehicle routing problem with multiple depots since it needs to deliver the goods from G grid-warehouses to M stage-stations, which is time-consuming. However, if the platform sticks to the plans, it only needs to solve TSP problems of small scales for G warehouses. For example, if G = 10,\; M = 80 , on average, the scale of the TSP problems of warehouses will merely be 8 . In the following comparisons, we assume that the platform will stick to the plans. We also assume that IV_{g} = \dfrac{N}{G}, \forall g \in \left\{1, 2, 3, \cdots , G\right\} , i.e., the inventories in warehouses are just enough to satisfy all the demands. Hereafter, with some abuse of notation, we denote \dfrac{N}{G} as IV . We set N = 1000 in all of our simulations. Such assumptions help us highlight the key points of our work.
The specific steps of one time of simulation are as follows:
Step 1. Set M, G, and Z.
Step 2. Generate the positions of all stage-stations and warehouses. Conduct the ternary-dividing algorithm and build the hierarchy trees of warehouses for each stage-station.
Step 3. Generate and record the positions of N consumers. The consumers arrive in sequence. For each consumer, conduct Algorithm 1 to find candidate stations for him/her. For each set of candidate stations chosen for a consumer, conduct Algorithm 2 to find the warehouse. After iterating over all consumers, store the delivery plans.
Step 4. Iterate over all consumers in sequence again. This time, for each consumer, we find the closest stage-station as the only candidate station. Conduct Algorithm 2 on the only candidate station to find the warehouse. After iterating over all consumers, the delivery plans are obtained.
Step 5. Solve the TSP problems deduced from Step 3 and Step 4. Compare the results of travel costs.
For simplification, in the following contexts, we use Method 1 to indicate the method integrated with Algorithms 1 and 2, and Method 2 for the common method of finding the closest station to a consumer. We take one of our simulations, where M=24,\;G=4,\;Z=5 , as an example, and present the delivery plans obtained from Methods 1 and 2 in Tables 3 and 4. As we can see, the delivery plan we obtain from Method 2 involves more stage-stations, while that from Method 1 makes the demands more concentrated. This is intuitive since the consumers’ positions are randomly generated. If we only find the closest available station for a consumer, it certainly leads to a more dispersed distribution of demands. The comparison of delivery plans helps us understand why our algorithms can decrease the delivery costs. Figs. 6 and 7 are the routes obtained from solving the TSP problems deduced from the plans, giving us a more straightforward demonstration of the differences.
Table 5 shows the results of some of our simulations. Column ‘Cost 1’ contains the costs of simulations with Method 1, and ‘Cost 2’ contains those with Method 2. From Table 5, we can acknowledge that Method 1 can effectively reduce the delivery costs. Fig. 8 demonstrates the results of different pairs of \left(M,G\right) . For each pair of \left(M,G\right) , we conduct a simulation with Method 1 for 100 times and obtain the average cost.
As we can see, for a fixed G, the increasing of M can reduce the delivery costs, which is intuitive since the more stage-stations there are, the more choices the platform has when building the set of candidates. However, we also find that the increasing of G may increase the delivery costs, which is counterintuitive. The reason behind this is that the increase in G makes inventories more dispersed. Algorithm 2 finds the closest in-stock warehouse for a candidate station, and the dispersion of warehouses makes more warehouses involved in the delivery plan, which increases the costs.
5.3
Comparison of consumers’ travel costs
In addition to the delivery costs, we also have to consider how our algorithms will affect consumers’ travel costs, i.e., the distances between them and the stage-stations the platform assigns for them. We calculate the average consumer travel costs of the 100 simulations for each pair of \left(M,G\right) , and Fig. 9 demonstrates the results.
Figure
9.
Average travel cost under different methods.
As we can observe from Fig. 9, the increasing of M affects the average consumer travel costs in opposite directions. When Method 1 is deployed, the increasing of M will increase consumer travel costs. The reason behind this is that more stage-stations make it more convenient for the platform to reduce delivery costs, but less considerable to consumers. The increasing of G can reduce consumer travel costs when M is fixed, which is opposite to the result of how G affects the platform’s delivery costs. This is because the dispersion of warehouses increases the number of trips and involves more stage-stations, which benefits consumers. From Fig. 9b, we know that the increasing of M can decrease consumer travel costs and that G is irrelevant when Method 2 is deployed. These results are intuitive since the more stage-stations there are, the more likely it is that there exists a station very close to a consumer, and the amounts of warehouses are relatively less important if we only consider the closest stage-station to a consumer.
6.
Robustness check
In this section, we conduct robustness check to verify the effectiveness of our algorithms. In previous sections, we assume that all the consumers will absolutely accept the recommendations of the platform. Now we remove this assumption and consider heterogeneous consumers. We assume that every consumer has a likelihood of acceptance and denote it as p, which means the probability of the consumer accepting the platform’s recommendation. If the consumer does not accept the recommendation, we assume that he wishes for choosing the closest stage-station. We set the p of all the consumers to be uniformly distributed on \left[0, 1\right] . In later simulations, the p of every consumer is randomly generated. Table 6 shows some results of random generation of p as examples.
Table
6.
Different degrees of likelihood of different consumers.
Other parts of the simulations in this section are same with Section 5 and 6. The delivery costs are shown in Table 7. We can find that nearly all the delivery costs in Table 7 are less than Cost 2 in Table 5, which proves that our algorithms are robust even after considering the possibility of consumers not accepting the platform’s recommendations.
In the context of rapid e-commerce development and motivated by the emergence of a business model called community group buying (CGB) in China and some countries in Southeast Asia, our paper establishes two hierarchy algorithms that help CGB platforms make better decisions on the dynamic matching of consumers and stage-stations and facilitate the updating of information. We conclude our results from the following three aspects.
First, we propose the hierarchy algorithm on stage-stations, which is based on Ref. [6]. We alter the fundamental design of the algorithm from binary-dividing to ternary-dividing. We prove that ternary-dividing can reach a faster running speed and present the results of simulations to highlight this point. We also mathematically show that ternary-dividing is a better data structure that possesses higher space efficiency.
Second, we build the hierarchy algorithm on grid-warehouses. Such an algorithm can be constructed results from the relatively static characteristics of stage-stations and grid-warehouses. With this algorithm, we utilize more storage space but obtain a better allocation of time on searching and updating. Specifically, without this algorithm, the time complexities of searching and updating are O\left(MG\right) and O\left(1\right) , but we change them to be O\left(M{\rm log}G\right) and O\left(M{\rm log}G\right) , i.e., we replace two extremes with two moderations and make an overall improvement.
Third, we conduct simulations to compare our algorithms with the current usual operation. We observe that our algorithms can reduce the platform’s delivery costs by concentrating the demands to fewer stage-stations, which makes the sizes of delivery problems smaller. We also obtain an interesting result that the more grid-warehouses, the higher the delivery costs. This is because more grid-warehouses will disperse the inventories and make problems more complicated. Our results offer suggestions to the CGB platform on managing the amounts of stage-stations and grid-warehouses.
Although the algorithms proposed in this paper are motivated by CGB platforms, they can also be utilized in other industries that involve the matching of demands and supplies. Examples include ride-sharing platforms with drivers and consumers, the food-delivery platforms with delivery workers and customers, and online-shopping platforms with package senders and receivers. Our algorithms and the data structures can help them to store, update, search, and control all kinds of information.
Our study has several limitations and can be extended for future research. First, in our simulations, we assume that the consumers would definitely accept the platform’s recommendation about the stage-stations. However, in reality, consumers may prefer searching and scanning other stage-stations that are not recommended and choose another station for picking up goods. Second, we assume that all the stage-stations, grid-warehouses, and consumers are uniformly distributed in \left[0,1\right]^2 , which can be improved with other more general random distributions. Third, we do not consider the behavior of stage-stations. Noting that the consumers will accept service from the owners of stage-stations (i.e., the agents of the platforms) and, there may be competition among these agents, which may affect the choice of consumers and should be considered by the platform while making recommendations. Future research can take the above into consideration and develop better algorithms that could handle more complicated settings.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (71991464, 71921001) and Fundamental Research Funds for the Central Universities, General Program (WK2040000053) and Key Program (YD2040002027).
Conflict of interest
The author declares that he has no conflict of interest.
We develop algorithms to help community group buying platforms compress the time complexities of matching a consumer with a stage-stations and updating the storage information.
With simulations, we make comparisons between our algorithms and the currently usual methods of community group buying platforms and find that our approaches can effectively reduce delivery costs.
With more stock keeping uinits, a community group buying platform may bear higher delivery costs.
Yu B, Shan W, Sheu J B, et al. Branch-and-price for a combined order selection and distribution problem in online community group-buying of perishable products. Transportation Research Part B: Methodological, 2022, 158: 341–373. DOI: 10.1016/j.trb.2022.03.001
[2]
Chatterjee S, Peled R, Peres Y, et al. Gravitational allocation to Poisson points. Annals of Mathematics, 2010, 172 (1): 617–671. DOI: 10.4007/annals.2010.172.617
[3]
Holden N, Peres Y, Zhai A. Gravitational allocation on the sphere. Proceedings of the National Academy of Sciences, 2018, 115 (39): 9666–9671. DOI: 10.1073/pnas.1720804115
[4]
Ajtai M, Komlós, J, Tusnády G. On optimal matchings. Combinatorica, 1984, 4: 259–264. DOI: 10.1007/BF02579135
[5]
Talagrand M. Matching random samples in many dimensions. The Annals of Applied Probability, 1992, 2 (4): 846–856. DOI: 10.1214/aoap/1177005578
[6]
Kanoria Y. Dynamic spatial matching. arXiv: 2105.07329, 2021 .
[7]
Hoffman C, Holroyd A E, Peres Y. A stable marriage of Poisson and Lebesgue. The Annals of Probability, 2006, 34 (4): 1241–1272. DOI: 10.1214/009117906000000098
[8]
Markó R, Timár Á. A Poisson allocation of optimal tail. The Annals of Probability, 2016, 44 (2): 1285–1307. DOI: 10.1214/15-AOP1001
[9]
Shor P W. The average-case analysis of some on-line algorithms for bin packing. Combinatorica, 1986, 6: 179–200. DOI: 10.1007/BF02579171
[10]
Shor P W. How to pack better than best fit: tight bounds for average-case online bin packing. In: Proceedings 32nd Annual Symposium of Foundations of Computer Science. San Juan, PR, USA: IEEE, 1991 : 752–759.
[11]
Akbarpour M, Alimohammadi Y, Li S, et al. The value of excess supply in spatial matching markets. arXiv: 2104.03219, 2021 .
[12]
Sun S, Zhang B. Operation strategies for nanostore in community group buying. Omega, 2022, 110: 102636. DOI: 10.1016/j.omega.2022.102636
[13]
Li J, Li B, Shen Y, et al. Study on the steady state of the propagation model of consumers’ perceived service quality in the community group-buying. Journal of Retailing and Consumer Services, 2022, 65: 102882. DOI: 10.1016/j.jretconser.2021.102882
[14]
Li B, Krushinsky D, Reijers H A, et al. The share-a-ride problem: People and parcels sharing taxis. European Journal of Operational Research, 2014, 238 (1): 31–40. DOI: 10.1016/j.ejor.2014.03.003
[15]
Qi W, Li L, Liu S, et al. Shared mobility for last-mile delivery: Design, operational prescriptions, and environmental impact. Manufacturing & Service Operations Management, 2018, 20 (4): 737–751. DOI: 10.1287/msom.2017.0683
[16]
Agussurja L, Cheng S F, Lau H C. A state aggregation approach for stochastic multiperiod last-mile ride-sharing problems. Transportation Science, 2019, 53 (1): 148–166. DOI: 10.1287/trsc.2018.0840
[17]
Cao J, Olvera-Cravioto M, Shen Z J. Last-mile shared delivery: A discrete sequential packing approach. Mathematics of Operations Research, 2020, 45 (4): 1466–1497. DOI: 10.1287/moor.2019.1039
[18]
Mousavi K, Bodur M, Roorda M J. Stochastic last-mile delivery with crowd-shipping and mobile depots. Transportation Science, 2022, 56 (3): 612–630. DOI: 10.1287/trsc.2021.1088
[19]
Lyu G, Teo C P. Last mile innovation: The case of the Locker Alliance network. Manufacturing & Service Operations Management, 2022, 24 (5): 2425–2443. DOI: 10.1287/msom.2021.1000
[20]
Reed S, Campbell A M, Thomas B W. Impact of autonomous vehicle assisted last-mile delivery in urban to rural settings. Transportation Science, 2022, 56 (6): 1530–1548. DOI: 10.1287/trsc.2022.1142
Figure
4.
The cubes in the deepest level for M = 600 .
Figure
5.
The hierarchy process of G=6 .
Figure
6.
Delivery routes of Warehouses 1 and 2.
Figure
7.
Delivery routes of Warehouses 3 and 4.
Figure
8.
The average cost of different pairs of (M, G).
Figure
9.
Average travel cost under different methods.
Figure
10.
The new sequence of events.
References
[1]
Yu B, Shan W, Sheu J B, et al. Branch-and-price for a combined order selection and distribution problem in online community group-buying of perishable products. Transportation Research Part B: Methodological, 2022, 158: 341–373. DOI: 10.1016/j.trb.2022.03.001
[2]
Chatterjee S, Peled R, Peres Y, et al. Gravitational allocation to Poisson points. Annals of Mathematics, 2010, 172 (1): 617–671. DOI: 10.4007/annals.2010.172.617
[3]
Holden N, Peres Y, Zhai A. Gravitational allocation on the sphere. Proceedings of the National Academy of Sciences, 2018, 115 (39): 9666–9671. DOI: 10.1073/pnas.1720804115
[4]
Ajtai M, Komlós, J, Tusnády G. On optimal matchings. Combinatorica, 1984, 4: 259–264. DOI: 10.1007/BF02579135
[5]
Talagrand M. Matching random samples in many dimensions. The Annals of Applied Probability, 1992, 2 (4): 846–856. DOI: 10.1214/aoap/1177005578
[6]
Kanoria Y. Dynamic spatial matching. arXiv: 2105.07329, 2021 .
[7]
Hoffman C, Holroyd A E, Peres Y. A stable marriage of Poisson and Lebesgue. The Annals of Probability, 2006, 34 (4): 1241–1272. DOI: 10.1214/009117906000000098
[8]
Markó R, Timár Á. A Poisson allocation of optimal tail. The Annals of Probability, 2016, 44 (2): 1285–1307. DOI: 10.1214/15-AOP1001
[9]
Shor P W. The average-case analysis of some on-line algorithms for bin packing. Combinatorica, 1986, 6: 179–200. DOI: 10.1007/BF02579171
[10]
Shor P W. How to pack better than best fit: tight bounds for average-case online bin packing. In: Proceedings 32nd Annual Symposium of Foundations of Computer Science. San Juan, PR, USA: IEEE, 1991 : 752–759.
[11]
Akbarpour M, Alimohammadi Y, Li S, et al. The value of excess supply in spatial matching markets. arXiv: 2104.03219, 2021 .
[12]
Sun S, Zhang B. Operation strategies for nanostore in community group buying. Omega, 2022, 110: 102636. DOI: 10.1016/j.omega.2022.102636
[13]
Li J, Li B, Shen Y, et al. Study on the steady state of the propagation model of consumers’ perceived service quality in the community group-buying. Journal of Retailing and Consumer Services, 2022, 65: 102882. DOI: 10.1016/j.jretconser.2021.102882
[14]
Li B, Krushinsky D, Reijers H A, et al. The share-a-ride problem: People and parcels sharing taxis. European Journal of Operational Research, 2014, 238 (1): 31–40. DOI: 10.1016/j.ejor.2014.03.003
[15]
Qi W, Li L, Liu S, et al. Shared mobility for last-mile delivery: Design, operational prescriptions, and environmental impact. Manufacturing & Service Operations Management, 2018, 20 (4): 737–751. DOI: 10.1287/msom.2017.0683
[16]
Agussurja L, Cheng S F, Lau H C. A state aggregation approach for stochastic multiperiod last-mile ride-sharing problems. Transportation Science, 2019, 53 (1): 148–166. DOI: 10.1287/trsc.2018.0840
[17]
Cao J, Olvera-Cravioto M, Shen Z J. Last-mile shared delivery: A discrete sequential packing approach. Mathematics of Operations Research, 2020, 45 (4): 1466–1497. DOI: 10.1287/moor.2019.1039
[18]
Mousavi K, Bodur M, Roorda M J. Stochastic last-mile delivery with crowd-shipping and mobile depots. Transportation Science, 2022, 56 (3): 612–630. DOI: 10.1287/trsc.2021.1088
[19]
Lyu G, Teo C P. Last mile innovation: The case of the Locker Alliance network. Manufacturing & Service Operations Management, 2022, 24 (5): 2425–2443. DOI: 10.1287/msom.2021.1000
[20]
Reed S, Campbell A M, Thomas B W. Impact of autonomous vehicle assisted last-mile delivery in urban to rural settings. Transportation Science, 2022, 56 (6): 1530–1548. DOI: 10.1287/trsc.2022.1142
Algorithm 1. Ternary-dividing (hierarchy on regions) for candidate stations.
Input: The number of all the stage-stations, M; the locations of all the stage-stations; the number of all the grid-warehouses, G; the locations of all the grid-warehouses; the number of candidate stage-stations the platform wishes for each consumer, Z.
Output:
1: Complete the dividing of regions, record the total amount of regions as l_{0} , i.e., l_{0} = \sum_{k = 0}^{k = L}9^{k} .
2: Insert all the stage-stations into the regions, record the stage-stations in each region. The number of stage-stations in a region R_{f} is stored as NS(R_{f}) .
3: for each consumer i \le N do
4: R_{i} = R_{0}, P = [R_{0}]
5: while {\rm Lev}(R_{i})< L do
6: According to the consumer’s location (x_{i}^{C}, y_{i}^{C}) , find one of the son-regions of R_{i} in which (x_{i}^{C}, y_{i}^{C}) falls. The target is denoted as R_{S} .
7: R_{i} = R_{S}, P = [P, R_{S}] .
8: end while
9: Obtain the numbers of stage-stations in each region of P and find two regions R_{a} and R_{b} that satisfy NS(R_{a}) = \max\left\{NS(R) \le Z | R \in P\right\} and NS(R_{b}) = \min\left\{NS(R) \ge Z | R \in P\right\} .
10: If NS(R_{a}) < M , we make all the stage-stations in R_{a} be candidate stage-stations, otherwise we make the stage-stations in R_{b} be candidates.