I was reading a lot of stuff about Dirichlet Convolution and how to solve sums over arithmetic functions. There are a lot of resources and I don’t consider that it’s an easy topic. Even when I don’t use Dirichlet here, It’s something that I’ll try to practice more because it’s really useful when you’re working with Mobius function. Maybe I need to review my past codes to see if I can improve them.
Statement
\begin{aligned} G(N) = \sum_{j=1}^{N} \sum_{i=1}^{j} \gcd(i, j) \end{aligned}
Find $G(10^{11})$. Give your answer modulo $998244353$.
Solution
Naive solution
Let’s try with the simplest and naive solution, just doing a nested for:
long long G = 0;
for(long long j=1; i<=N; i++){
for(long long i=1; i<=j; i++) G += __gcd(i, j);
}
Of course, the time complexity of this solution is: $\mathcal{O}(N^{2})$ and not feasible to compute since the value for $N$ is $10^{11}$.
Linear solution
From my previous solution in a past blog, using the Mobius function it is possible to reduce the solution to $\mathcal{O}(n\log(n))$ and by applying the Mertens function optimization, it is possible to reduce it to $\mathcal{O}(n)$.
Even when this time complexity is acceptable (it could take a few hours), the memory utilization is close to the order of terabytes, so it is not efficient.
long long G = 0;
vector <int> values;
vector <pair <int,int> > range;
fill(n, values, range);
for(int i=0; i<values.size(); i++){
int v = values[i];
long long aux = 0;
vector <int> n_v;
vector <pair <int,int> > n_r;
fill(v, n_v,n_r);
for(int j=0; j<n_v.size(); j++){
int k = n_v[j];
long long s = 1LL*k*k-(1LL*k*(k+1))/2;
aux += s*(s_mobius[n_r[j].first] - s_mobius[n_r[j].second-1]);
}
long long s = (1LL*range[i].first*(range[i].first+1))/2;
s -= (1LL*range[i].second*(range[i].second-1))/2;
G += aux*s;
}
Sub-linear solution
Since even the linear solution is not acceptable for this problem. We need to find a sub-linear approach, in this case, I will try to formulate a $\mathcal{O}(n^{2/3})$ solution, which can be computed in few seconds.
First, I’ll calculate the sum of $\gcd(i, n)$:
\begin{aligned} g(n) &= \sum_{i=1}^{n} \gcd(i, n)\cr &= \sum_{k\vert n} k \phi\Big( \frac{n}{k} \Big)\cr &= (n * \phi)(n) \end{aligned}
The last line is the dirichlet notation, it could be useful.
Now, we can use $g(n)$ to calculate $G(N)$:
\begin{aligned} G(N) &= \sum_{i=1}^{N} g(i)\cr &= \sum_{i=1}^{N} \sum_{k \vert i} \phi(k) \frac{i}{k}\cr &= \sum_{k=1}^{N} \sum_{p=1}^{\frac{N}{k}} \phi(k) p\cr &= \sum_{k=1}^{N} \sum_{p=1}^{\frac{N}{k}} \phi(k) p\cr &= \sum_{k=1}^{N} \phi(k) \frac{\lfloor \frac{N}{k}\rfloor (\lfloor \frac{N}{k}\rfloor + 1)}{2}\cr \end{aligned}
As you can see, we want to calculate sums of $\phi(k)$, so let’s to reduce it:
\begin{aligned} S(n) &= \sum_{i=1}^{n} \phi(i) \cr S(n) &= \Phi(n) \cr &= \sum_{b=1}^{n} \sum_{a=1}^{b} \big\vert \gcd(a, b) = 1 \big\vert \cr &= \frac{n(n+1)}{2} - \sum_{m=2}^{n} \sum_{y=1}^{\lfloor n/m \rfloor} \sum_{x=1}^{\lfloor b/m \rfloor} \big\vert \gcd(x, y) = 1 \big\vert \cr &= \frac{n(n+1)}{2} - \sum_{m=2}^{n} S(\lfloor n/m \rfloor) \cr &= \frac{n(n+1)}{2} - \sum_{m=2}^{\lfloor \sqrt{n} \rfloor} S(\lfloor n/m \rfloor) - \sum_{m=1}^{\lfloor \frac{n}{\lfloor \sqrt{n} \rfloor} \rfloor - 1} \Big(\Big\lfloor\frac{n}{m} \Big\rfloor - \Big\lfloor\frac{n}{m+1} \Big\rfloor \Big) S(m) \end{aligned}
This is a recursive definition. It’s not hard to see that the time complexity for this is $\mathcal{O}(n^{2/3})$.
Similar to the explained in a previous post, the list of possible values of $\lfloor N/k \rfloor$ has a size of $2\sqrt{N}$. I will use the hyperbola method to calculate it:
\begin{aligned} G(N) &= \sum_{k=1}^{N} \phi(k) \frac{\lfloor \frac{N}{k}\rfloor (\lfloor \frac{N}{k}\rfloor + 1)}{2}\cr &= \sum_{k=1}^{N} \phi(k) \sum_{ab=k} a \phi(b) \cr &= \sum_{a=1}^{\lfloor \sqrt{N} \rfloor} \sum_{b=1}^{\lfloor N/a\rfloor} a\phi(b) + \sum_{b=1}^{\lfloor \sqrt{N}\rfloor} \sum_{a=1}^{\lfloor N/b\rfloor} a\phi(b) - \sum_{a=1}^{\lfloor\sqrt{N}\rfloor} \sum_{b=1}^{\lfloor\sqrt{N}\rfloor} a\phi(b)\cr &= \sum_{a=1}^{\lfloor\sqrt{N}\rfloor} a \Phi(\lfloor N/a \rfloor) + \sum_{b=1}^{\lfloor \sqrt{N} \rfloor} \phi(b) \frac{\lfloor N/b\rfloor \times (\lfloor N/b\rfloor + 1)}{2} - \frac{\lfloor \sqrt{N}\rfloor (\lfloor \sqrt{N} \rfloor + 1)}{2}\Phi(\lfloor \sqrt{N} \rfloor)\cr \end{aligned}
To optimize the complexity, suppose that we will do a sieve until $N/k$, so the overall complexity will be:
\begin{aligned} \mathcal{O}(G(N)) &= \mathcal{O}\Big(\sum_{a=1}^{k} \mathcal{O}\big(\Phi(\lfloor N/a\rfloor)\big)\Big) + \mathcal{O}(\sqrt{N}) + \mathcal{O}(N/k \log(N/k))\cr &= \mathcal{O}\Big(\sum_{a=1}^{k} 1\Big) + \mathcal{O}(N^{2/3}) + \mathcal{O}(\sqrt{N}) + \mathcal{O}(N/k \log(N/k))\cr &= \mathcal{O}(k) + \mathcal{O}(N^{2/3}) + \mathcal{O}(\sqrt{N}) + \mathcal{O}(N/k \log(N/k))\cr &\text{Optimal value is when:}\cr & k \approx N^{69/125}\cr \mathcal{O}(G(N)) &= \mathcal{O}(k + N^{2/3} + \sqrt{N} + N/k\log(N/k))\cr &= \mathcal{O}(N^{2/3}) \end{aligned}
Code
Here is the code, it’s not so fast due to mod
operations.
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
#define N 10000000
#define MOD 998244353
int phi[N];
int c_Phi[N];
unordered_map <ll, int> m_Phi;
void cphi(){
phi[1] = 1;
for(int i=2; i<N; i++){
if(!phi[i]){
phi[i] = i-1;
for(ll j=2*i; j<N; j+=i){
if(phi[j] == 0) phi[j] = j;
phi[j] = phi[j]/i*(i-1);
}
}
}
c_Phi[1] = phi[1];
for(int i=2; i<N; i++) c_Phi[i] = (c_Phi[i-1] + phi[i])%MOD;
}
int T(ll n){
n %= MOD;
ll ans = n*(n+1);
ans %= MOD;
ans *= 499122177;
return ans%MOD;
}
ll Phi(ll n){
if(n < N) return c_Phi[n];
if(m_Phi.find(n) != m_Phi.end()) return m_Phi[n];
ll ans = T(n);
ll sqrt_n = floor(sqrt(n));
for(int m=1; m < n/sqrt_n; m++){
ans -= 1LL*(n/m - n/(m+1))*c_Phi[m];
ans %= MOD;
}
for(int m=2; m<=sqrt_n; m++){
ans -= Phi(n/m);
ans %= MOD;
}
ans += MOD; ans %= MOD;
m_Phi[n] = ans;
return ans;
}
ll G(ll n){
ll ans = 0;
int sqrt_n = floor(sqrt(n));
ans -= 1LL*T(sqrt_n)*Phi(sqrt_n);
ans %= MOD;
for(int i=1; i<=sqrt_n; i++){
ans += Phi(n/i)*i;
ans %= MOD;
}
for(int i=1; i<=sqrt_n; i++){
ans += 1LL*phi[i]*T(n/i);
ans %= MOD;
}
ans += MOD; ans %= MOD;
return ans;
}
int main(){
cphi();
ll n; cin >> n;
cout << G(n) << '\n';
return 0;
}
Resource:
I want to list them because I would like to go deeper into them and should review them further at my leisure.