Skip to content

Failover Mekanizmaları

Tanım: Sistemdeki bir bileşenin başarısız olması durumunda trafiği başka bir bileşene yönlendirme sürecidir.

Failover Türleri

Aktif-Pasif (Hot Standby)

  • Yedek bileşen sürekli hazır durumda bekler
  • Veri senkronizasyonu gerçek zamanlı yapılır
  • Failover süresi minimum seviyededir
  • Örnek: PostgreSQL streaming replication

Aktif-Aktif (Active-Active)

  • Birden fazla bileşen aynı anda trafik yönetir
  • Yük dengeleme ile birlikte kullanılır
  • Veri tutarlılığı için özel mekanizmalar gerekir
  • Örnek: Redis Cluster, MongoDB Replica Set

Warm Standby

  • Yedek bileşen hazır ama tam senkronize değil
  • Failover süresi daha uzundur
  • Maliyet açısından daha uygundur
  • Örnek: AWS RDS Multi-AZ deployment

Failover Stratejileri

Otomatik Failover

  • Sistem otomatik olarak yedek bileşene geçiş yapar
  • Health check mekanizmaları ile tetiklenir
  • Minimum insan müdahalesi gerektirir

Manuel Failover

  • Operatör kontrolünde geçiş yapılır
  • Planlı bakım için kullanılır
  • Risk yönetimi daha kolaydır

Spring Boot Implementasyonları

Database Failover

Multi-DataSource Configuration

java
@Configuration
public class DatabaseFailoverConfig {
    
    @Primary
    @Bean("primaryDataSource")
    @ConfigurationProperties("spring.datasource.primary")
    public DataSource primaryDataSource() {
        return DataSourceBuilder.create().build();
    }
    
    @Bean("secondaryDataSource")
    @ConfigurationProperties("spring.datasource.secondary")
    public DataSource secondaryDataSource() {
        return DataSourceBuilder.create().build();
    }
    
    @Bean
    public DataSource routingDataSource() {
        RoutingDataSource routingDataSource = new RoutingDataSource();
        
        Map<Object, Object> dataSourceMap = new HashMap<>();
        dataSourceMap.put("primary", primaryDataSource());
        dataSourceMap.put("secondary", secondaryDataSource());
        
        routingDataSource.setTargetDataSources(dataSourceMap);
        routingDataSource.setDefaultTargetDataSource(primaryDataSource());
        
        return routingDataSource;
    }
}

public class RoutingDataSource extends AbstractRoutingDataSource {
    
    @Override
    protected Object determineCurrentLookupKey() {
        return DatabaseContextHolder.getDatabase();
    }
}

@Component
public class DatabaseContextHolder {
    
    private static final ThreadLocal<String> contextHolder = new ThreadLocal<>();
    
    public static void setDatabase(String database) {
        contextHolder.set(database);
    }
    
    public static String getDatabase() {
        return contextHolder.get();
    }
    
    public static void clearDatabase() {
        contextHolder.remove();
    }
}

Database Health Monitoring

java
@Component
@Slf4j
public class DatabaseFailoverService {
    
    private final DataSource primaryDataSource;
    private final DataSource secondaryDataSource;
    private volatile boolean primaryHealthy = true;
    
    @Scheduled(fixedDelay = 5000)
    public void checkDatabaseHealth() {
        try {
            checkPrimaryDatabase();
            if (!primaryHealthy) {
                log.info("Primary database recovered, switching back");
                DatabaseContextHolder.setDatabase("primary");
                primaryHealthy = true;
            }
        } catch (Exception e) {
            if (primaryHealthy) {
                log.warn("Primary database failed, switching to secondary", e);
                DatabaseContextHolder.setDatabase("secondary");
                primaryHealthy = false;
            }
        }
    }
    
    private void checkPrimaryDatabase() throws SQLException {
        try (Connection connection = primaryDataSource.getConnection()) {
            connection.createStatement().execute("SELECT 1");
        }
    }
}

Service-to-Service Failover

Resilience4j Circuit Breaker ile Failover

java
@Configuration
public class CircuitBreakerConfig {
    
    @Bean
    public CircuitBreaker userServiceCircuitBreaker() {
        return CircuitBreaker.ofDefaults("user-service");
    }
    
    @Bean
    public TimeLimiter userServiceTimeLimiter() {
        return TimeLimiter.of(Duration.ofSeconds(3));
    }
}

@Service
public class UserService {
    
    @CircuitBreaker(name = "user-service", fallbackMethod = "fallbackGetUser")
    @TimeLimiter(name = "user-service")
    @Retry(name = "user-service")
    public CompletableFuture<User> getUser(Long userId) {
        return CompletableFuture.supplyAsync(() -> {
            return primaryUserServiceClient.getUser(userId);
        });
    }
    
    // Fallback: Secondary service'e geç
    public CompletableFuture<User> fallbackGetUser(Long userId, Exception e) {
        log.warn("Primary user service failed, using secondary: {}", e.getMessage());
        return CompletableFuture.supplyAsync(() -> {
            return secondaryUserServiceClient.getUser(userId);
        });
    }
    
    // Son çare: Cache'den döndür
    public CompletableFuture<User> cacheFallbackGetUser(Long userId, Exception e) {
        log.warn("All user services failed, trying cache: {}", e.getMessage());
        return CompletableFuture.supplyAsync(() -> {
            User cachedUser = userCacheService.getUser(userId);
            if (cachedUser != null) {
                return cachedUser;
            }
            throw new UserNotFoundException("User not found in any source");
        });
    }
}

Service Discovery ile Automatic Failover

java
@Configuration
@EnableDiscoveryClient
public class ServiceDiscoveryConfig {
    
    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
    
    @Bean
    public DiscoveryClient discoveryClient() {
        return new EurekaDiscoveryClient();
    }
}

@Service
public class FailoverAwareServiceClient {
    
    @Autowired
    private DiscoveryClient discoveryClient;
    
    @Autowired
    @LoadBalanced
    private RestTemplate restTemplate;
    
    public ResponseEntity<String> callService(String serviceName, String endpoint) {
        List<ServiceInstance> instances = discoveryClient.getInstances(serviceName);
        
        for (ServiceInstance instance : instances) {
            try {
                String url = instance.getUri() + endpoint;
                return restTemplate.getForEntity(url, String.class);
            } catch (Exception e) {
                log.warn("Service instance {} failed: {}", instance.getUri(), e.getMessage());
                // Bir sonraki instance'ı dene
                continue;
            }
        }
        
        throw new ServiceUnavailableException("All instances of " + serviceName + " are down");
    }
}

Application Server Failover

Spring Boot ile Health Check Endpoints

java
@Component
public class CustomHealthIndicator implements HealthIndicator {
    
    @Autowired
    private DatabaseHealthService databaseHealthService;
    
    @Autowired
    private ExternalServiceHealthService externalServiceHealthService;
    
    @Override
    public Health health() {
        boolean isDatabaseHealthy = databaseHealthService.isHealthy();
        boolean areExternalServicesHealthy = externalServiceHealthService.areHealthy();
        
        if (isDatabaseHealthy && areExternalServicesHealthy) {
            return Health.up()
                .withDetail("database", "UP")
                .withDetail("external-services", "UP")
                .build();
        } else {
            return Health.down()
                .withDetail("database", isDatabaseHealthy ? "UP" : "DOWN")
                .withDetail("external-services", areExternalServicesHealthy ? "UP" : "DOWN")
                .build();
        }
    }
}

@RestController
public class HealthController {
    
    @Autowired
    private HealthEndpoint healthEndpoint;
    
    @GetMapping("/health/ready")
    public ResponseEntity<Map<String, Object>> readiness() {
        Health health = healthEndpoint.health();
        
        if (health.getStatus() == Status.UP) {
            return ResponseEntity.ok(Map.of("status", "ready"));
        } else {
            return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
                .body(Map.of("status", "not-ready", "details", health.getDetails()));
        }
    }
    
    @GetMapping("/health/live")
    public ResponseEntity<Map<String, Object>> liveness() {
        // Basit liveness check - uygulama çalışıyor mu?
        return ResponseEntity.ok(Map.of("status", "alive"));
    }
}

Load Balancer Integration

NGINX ile Health Check ve Failover

nginx
upstream backend_servers {
    server app1.example.com:8080 max_fails=3 fail_timeout=30s;
    server app2.example.com:8080 max_fails=3 fail_timeout=30s;
    server app3.example.com:8080 backup; # Backup server
}

server {
    listen 80;
    
    location /health {
        access_log off;
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }
    
    location / {
        proxy_pass http://backend_servers;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
        proxy_connect_timeout 5s;
        proxy_send_timeout 10s;
        proxy_read_timeout 10s;
        
        # Health check headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

AWS Application Load Balancer ile Failover

yaml
# ALB Target Group Health Check
TargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    HealthCheckEnabled: true
    HealthCheckIntervalSeconds: 30
    HealthCheckPath: /actuator/health
    HealthCheckPort: 8080
    HealthCheckProtocol: HTTP
    HealthCheckTimeoutSeconds: 5
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 3
    Matcher:
      HttpCode: 200
    Port: 8080
    Protocol: HTTP
    VpcId: !Ref VPC

Cache Failover Strategies

Redis Cluster Failover

java
@Configuration
public class RedisFailoverConfig {
    
    @Bean
    public LettuceConnectionFactory redisConnectionFactory() {
        RedisClusterConfiguration clusterConfig = new RedisClusterConfiguration();
        clusterConfig.clusterNode("redis-1.example.com", 6379);
        clusterConfig.clusterNode("redis-2.example.com", 6379);
        clusterConfig.clusterNode("redis-3.example.com", 6379);
        
        LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .commandTimeout(Duration.ofSeconds(2))
            .build();
            
        return new LettuceConnectionFactory(clusterConfig, clientConfig);
    }
    
    @Bean
    public RedisTemplate<String, Object> redisTemplate() {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(redisConnectionFactory());
        template.setDefaultSerializer(new GenericJackson2JsonRedisSerializer());
        return template;
    }
}

@Service
public class CacheFailoverService {
    
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
    
    @Cacheable(value = "users", unless = "#result == null")
    public User getUser(Long userId) {
        try {
            return redisTemplate.opsForValue().get("user:" + userId);
        } catch (Exception e) {
            log.warn("Redis cache failed, falling back to database: {}", e.getMessage());
            return userRepository.findById(userId).orElse(null);
        }
    }
}

Multi-Level Cache Failover

java
@Service
public class MultiLevelCacheService {
    
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
    
    private final ConcurrentHashMap<String, Object> localCache = new ConcurrentHashMap<>();
    
    public User getUser(Long userId) {
        String key = "user:" + userId;
        
        // Level 1: Local cache
        User user = (User) localCache.get(key);
        if (user != null) {
            return user;
        }
        
        // Level 2: Redis cache
        try {
            user = (User) redisTemplate.opsForValue().get(key);
            if (user != null) {
                localCache.put(key, user);
                return user;
            }
        } catch (Exception e) {
            log.warn("Redis cache failed: {}", e.getMessage());
        }
        
        // Level 3: Database
        user = userRepository.findById(userId).orElse(null);
        if (user != null) {
            // Cache'lere geri yaz
            localCache.put(key, user);
            try {
                redisTemplate.opsForValue().set(key, user, Duration.ofMinutes(10));
            } catch (Exception e) {
                log.warn("Failed to update Redis cache: {}", e.getMessage());
            }
        }
        
        return user;
    }
}

Monitoring ve Alerting

Spring Boot Actuator Metrics

java
@Component
public class FailoverMetrics {
    
    private final MeterRegistry meterRegistry;
    private final Counter failoverCounter;
    
    public FailoverMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        this.failoverCounter = Counter.builder("failover.events")
            .description("Number of failover events")
            .register(meterRegistry);
    }
    
    public void recordFailover(String component, String reason) {
        failoverCounter.increment(
            Tags.of(
                "component", component,
                "reason", reason
            )
        );
    }
}

Failover mekanizmaları, sistem güvenilirliğinin temel taşlarından biridir. Proper configuration ve monitoring ile minimum downtime sağlanabilir.

Eren Demir tarafından oluşturulmuştur.