Distributed Services with Go: accessing the service outside the pods (Beta-4 pg.223)

Once deployed, I am not able to access the service from outside the proglog pods. I can run the client successfully and access the service when the client runs on one of the proglog pods.

However, I am not able to access the service from the following locations:

  1. Laptop: The client connection hangs and then returns "connection error while dialing dial tcp: lookup proglog-0.proglog.mysvcs.svc.cluster.local: no such host". In the client I am dialing the EXTERNAL-IP address that k8s reports for proglog-0 (this address works when used from a proglog pod).

  2. A pod in the k8s cluster that is not a proglog pod: It prints the following error repeatedly: "ERROR: 2020/10/31 18:59:29 [core] subconn returned from pick is not *acBalancerWrapper". The same acBalancerWrapper error also appears from time to time when running the client from a proglog pod (where it usually works).

Has anyone faced this before? Any ideas?
At the moment I’m looking into GetServers returning the K8s DNS names (e.g. proglog-0.proglog.default.svc.cluster.local:8400) rather than the EXTERNAL-IP of the pods.


For 1., did you port-forward the pod/service to expose it outside the k8s network? That’s what the $ kubectl port-forward pod/proglog-0 8400 8400 command in the book is for.

For 2., it could be because leader election is still underway, so a nil subconn is returned for the leader.

For 2. I think the error handling should be improved. I’m looking into it.


Yeah, we should return balancer.ErrNoSubConnAvailable when there is no subconn to pick, like so:

func (p *Picker) Pick(info balancer.PickInfo) (
	balancer.PickResult, error) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	var result balancer.PickResult
	if strings.Contains(info.FullMethodName, "Produce") ||
		len(p.followers) == 0 {
		result.SubConn = p.leader
	} else if strings.Contains(info.FullMethodName, "Consume") {
		result.SubConn = p.nextFollower()
	}
	if result.SubConn == nil {
		return result, balancer.ErrNoSubConnAvailable
	}
	return result, nil
}

That way gRPC blocks the RPC until a new picker with available subconnections is provided.
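
For anyone wondering why a nil subconn produces that particular message: judging by the error text, gRPC type-asserts the picked SubConn to its internal wrapper type, and a type assertion on a nil interface value always fails in Go. Here is a minimal sketch of that behavior; SubConn, wrapper, and check are simplified stand-ins I made up, not gRPC's actual types.

```go
package main

import "fmt"

// SubConn is a simplified stand-in for gRPC's balancer.SubConn interface.
type SubConn interface{ Connect() }

// wrapper is a stand-in for gRPC's internal *acBalancerWrapper type.
type wrapper struct{}

func (*wrapper) Connect() {}

// check mimics asserting a picked SubConn to the expected concrete type.
func check(sc SubConn) error {
	if _, ok := sc.(*wrapper); !ok {
		return fmt.Errorf("subconn returned from pick is not *wrapper")
	}
	return nil
}

func main() {
	var leader SubConn // nil: no leader is known yet

	fmt.Println(check(leader))     // the assertion fails on a nil interface
	fmt.Println(check(&wrapper{})) // a real subconn passes
}
```

Returning an error for the nil case from Pick avoids handing gRPC a result it cannot use.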


I’m not able to get the load-balancing picker to work in GCP. However, it works when I port-forward.

I’ve tried using the following:

  • Pod IP (results in a hang)

  • Load balancer CLUSTER-IP:8400 address as allocated by the service (results in a timeout)

  • Load balancer EXTERNAL-IP:8400 address as allocated by the service (results in a hang)

  • Using the bind address: $HOSTNAME.proglog.{{.Release.Namespace}}.svc.cluster.local:{{.Values.rpcPort}} from a different service pod in the cluster (results in an error: subconn returned from pick is not *acBalancerWrapper)

client code:

		lbAddr := fmt.Sprintf("%s:///%s", loadbalance.Name, *addr)
		conn, err = grpc.Dial(lbAddr, opts...)
		if err != nil {
			log.Fatal(err)
		}

How can we use the client to load balance on GCP without port forwarding? Is it possible?
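
For reference, this is roughly how a target string like the one built above splits into a scheme and an endpoint. It is a sketch under assumptions: splitTarget is a hypothetical helper, "proglog" is an assumed value for loadbalance.Name, and the address is just an example. As I understand it, the endpoint part is what the custom resolver gets to work with, so it has to be reachable from wherever the client runs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitTarget (hypothetical helper) splits a target of the form
// scheme:///endpoint into its scheme and endpoint.
func splitTarget(target string) (scheme, endpoint string, err error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", "", err
	}
	// With an empty authority, the endpoint is the path minus its leading slash.
	return u.Scheme, strings.TrimPrefix(u.Path, "/"), nil
}

func main() {
	name := "proglog" // assumed value of loadbalance.Name
	addr := "proglog-0.proglog.default.svc.cluster.local:8400"

	scheme, endpoint, err := splitTarget(fmt.Sprintf("%s:///%s", name, addr))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("scheme:", scheme)
	fmt.Println("endpoint:", endpoint)
}
```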

Please excuse this. At some point I changed the following line:

	if l.config.Raft.Bootstrap {
		config := raft.Configuration{
			Servers: []raft.Server{{
				ID:      config.LocalID,
				Address: raft.ServerAddress(l.config.Raft.BindAddr), // <--- i mistakenly changed this to transport.LocalAddr()
			}},
		}
		err = l.raft.BootstrapCluster(config).Error()
	}
	return err

Now when I use the bind-addr, the load balancing is working from within the K8s network!